SEQUENCE LISTING



(1) GENERAL INFORMATION: 

(i) APPLICANT: Murphy, Patricia D. 

Allen, Antonette C. 
Alvares, Christopher P. 
Critz, Brenda S. 
Olson, Sheri J. 
Schelter, Denise B. 
Zeng, Bin 



(ii) TITLE OF INVENTION: A Sequence of the Human BRCA1 Gene
(iii) NUMBER OF SEQUENCES: 78



(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: ONCORMED

(B) STREET: 200 PERRY PJ 

(C) CITY: GAITHERSBURG 

(D) STATE: MD 

(E) COUNTRY: USA 

(F) ZIP: 20877 




(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk

(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: to be assigned 
(B) FILING DATE: herewith

(C) CLASSIFICATION:



(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: JR. THOMAS GALLEGOS 

(B) REGISTRATION NUMBER: 32,692

(C) REFERENCE/DOCKET NUMBER: PA-0054

(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: 301-527-2051 

(B) TELEFAX: 301-208-6997 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5711 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(B) STRAIN: BRCA1 

(viii) POSITION IN GENOME: 

(A) CHROMOSOME/SEGMENT: 17

(B) MAP POSITION: 17q21 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

AGCTCGCTGA GACTTCCTGG ACCCCGCACC AGGCTGTGGG GTTTCTCAGA TAACTGGGCC 60

CCTGCGCTCA GGAGGCCTTC ACCCTCTGCT CTGGGTAAAG TTCATTGGAA CAGAAAGAAA 120

TGGATTTATC TGCTCTTCGC GTTGAAGAAG TACAAAATGT CATTAATGCT ATGCAGAAAA 180

TCTTAGAGTG TCCCATCTGT CTGGAGTTGA TCAAGGAACC TGTCTCCACA AAGTGTGACC 240

ACATATTTTG CAAATTTTGC ATGCTGAAAC TTCTCAACCA GAAGAAAGGG CCTTCACAGT 300

GTCCTTTATG TAAGAATGAT ATAACCAAAA GGAGCCTACA AGAAAGTACG AGATTTAGTC 360

AACTTGTTGA AGAGCTATTG AAAATCATTT GTGCTTTTCA GCTTGACACA GGTTTGGAGT 420 

ATGCAAACAG CTATAATTTT GCAAAAAAGG AAAATAACTC TCCTGAACAT CTAAAAGATG 480 

AAGTTTCTAT CATCCAAAGT ATGGGCTACA GAAACCGTGC CAAAAGACTT CTACAGAGTG 540 

AACCCGAAAA TCCTTCCTTG CAGGAAACCA GTCTCAGTGT CCAACTCTCT AACCTTGGAA 600 

CTGTGAGAAC TCTGAGGACA AAGCAGCGGA TACAACCTCA AAAGACGTCT GTCTACATTG 660 

AATTGGGATC TGATTCTTCT GAAGATACCG TTAATAAGGC AACTTATTGC AGTGTGGGAG 720 

ATCAAGAATT GTTACAAATC ACCCCTCAAG GAACCAGGGA TGAAATCAGT TTGGATTCTG 780 
CAAAAAAGGC TGCTTGTGAA TTTTCTGAGA CGGATGTAAC AAATACTGAA CATCATCAAC 840 

CCAGTAATAA TGATTTGAAC ACCACTGAGA AGCGTGCAGC TGAGAGGCAT CCAGAAAAGT 900
ATCAGGGTAG TTCTGTTTCA AACTTGCATG TGGAGCCATG TGGCACAAAT ACTCATGCCA 960 

GCTCATTACA GCATGAGAAC AGCAGTTTAT TACTCACTAA AGACAGAATG AATGTAGAAA 1020 

AGGCTGAATT CTGTAATAAA AGCAAACAGC CTGGCTTAGC AAGGAGCCAA CATAACAGAT 1080 

GGGCTGGAAG TAAGGAAACA TGTAATGATA GGCGGACTCC CAGCACAGAA AAAAAGGTAG 1140 

ATCTGAATGC TGATCCCCTG TGTGAGAGAA AAGAATGGAA TAAGCAGAAA GTGCCATGCT 1200 

CAGAGAATCC TAGAGATACT GAAGATGTTC CTTGGATAAC ACTAAATAGC AGCATTCAGA 1260

AAGTTAATGA GTGGTTTTCC AGAAGTGATG AACTGTTAGG TTCTGATGAC TCACATGATG 1320

GGGAGTCTGA ATCAAATGCC AAAGTAGCTG ATGTATTGGA CGTTCTAAAT GAGGTAGATG 1380

AATATTCTGG TTCTTCAGAG AAAATAGACT TACTGGCCAG TGATCCTCAT GAGGCTTTAA 1440

TATGTAAAAG TGAAAGAGTT CACTCCAAAT CAGTAGAGAG TAATATTGAA GACAAAATAT 1500

TTGGGAAAAC CTATCGGAAG AAGGCAAGCC TCCCCAACTT AAGCCATGTA ACTGAAAATC 1560

TAATTATAGG AGCATTTGTT ACTGAGCCAC AGATAATACA AGAGCGTCCC CTCACAAATA 1620 

AATTAAAGCG TAAAAGGAGA CCTACATCAG GCCTTCATCC TGAGGATTTT ATCAAGAAAG 1680

CAGATTTGGC AGTTCAAAAG ACTCCTGAAA TGATAAATCA GGGAACTAAC CAAACGGAGC 1740 

AGAATGGTCA AGTGATGAAT ATTACTAATA GTGGTCATGA GAATAAAACA AAAGGTGATT 1800 

CTATTCAGAA TGAGAAAAAT CCTAACCCAA TAGAATCACT CGAAAAAGAA TCTGCTTTCA 1860 

AAACGAAAGC TGAACCTATA AGCAGCAGTA TAAGCAATAT GGAACTCGAA TTAAATATCC 1920

ACAATTCAAA AGCACCTAAA AAGAATAGGC TGAGGAGGAA GTCTTCTACC AGGCATATTC 1980 

ATGCGCTTGA ACTAGTAGTC AGTAGAAATC TAAGCCCACC TAATTGTACT GAATTGCAAA 2040 

TTGATAGTTG TTCTAGCAGT GAAGAGATAA AGAAAAAAAA GTACAACCAA ATGCCAGTCA 2100 

GGCACAGCAG AAACCTACAA CTCATGGAAG GTAAAGAACC TGCAACTGGA GCCAAGAAGA 2160

GTAACAAGCC AAATGAACAG ACAAGTAAAA GACATGACAG TGATACTTTC CCAGAGCTGA 2220 

AGTTAACAAA TGCACCTGGT TCTTTTACTA AGTGTTCAAA TACCAGTGAA CTTAAAGAAT 2280 

TTGTCAATCC TAGCCTTCCA AGAGAAGAAA AAGAAGAGAA ACTAGAAACA GTTAAAGTGT 2340

CTAATAATGC TGAAGACCCC AAAGATCTCA TGTTAAGTGG AGAAAGGGTT TTGCAAACTG 2400

AAAGATCTGT AGAGAGTAGC AGTATTTCAC TGGTACCTGG TACTGATTAT GGCACTCAGG 2460

AAAGTATCTC GTTACTGGAA GTTAGCACTC TAGGGAAGGC AAAAACAGAA CCAAATAAAT 2520

GTGTGAGTCA GTGTGCAGCA TTTGAAAACC CCAAGGGACT AATTCATGGT TGTTCCAAAG 2580

ATAATAGAAA TGACACAGAA GGCTTTAAGT ATCCATTGGG ACATGAAGTT AACCACAGTC 2640 

GGGAAACAAG CATAGAAATG GAAGAAAGTG AACTTGATGC TCAGTATTTG CAGAATACAT 2700

TCAAGGTTTC AAAGCGCCAG TCATTTGCTC TGTTTTCAAA TCCAGGAAAT GCAGAAGAGG 2760

AATGTGCAAC ATTCTCTGCC CACTCTGGGT CCTTAAAGAA ACAAAGTCCA AAAGTCACTT 2820 

TTGAATGTGA ACAAAAGGAA GAAAATCAAG GAAAGAATGA GTCTAATATC AAGCCTGTAC 2880 

AGACAGTTAA TATCACTGCA GGCTTTCCTG TGGTTGGTCA GAAAGATAAG CCAGTTGATA 2940

ATGCCAAATG TAGTATCAAA GGAGGCTCTA GGTTTTGTCT ATCATCTCAG TTCAGAGGCA 3000

ACGAAACTGG ACTCATTACT CCAAATAAAC ATGGACTTTT ACAAAACGCA TATCGTATAC 3060

CACCACTTTT TCCCATCAAG TCATTTGTTA AAACTAAATG TAAGAAAAAT CTGCTAGAGG 3120 

AAAACTTTGA GGAACATTCA ATGTCACCTG AAAGAGAAAT GGGAAATGAG AACATTCCAA 3180

GTACAGTGAG CACAATTAGC CGTAATAACA TTAGAGAAAA TGTTTTTAAA GGAGCCAGCT 3240 

CAAGCAATAT TAATGAAGTA GGTTCCAGTA CTAATGAAGT GGGCTCCAGT ATTAATGAAA 3300

TAGGTTCCAG TGATGAAAAC ATTCAAGCAG AACTAGGTAG AAACAGAGGG CCAAAATTGA 3360

ATGCTATGCT TAGATTAGGG GTTTTGCAAC CTGAGGTCTA TAAACAAAGT CTTCCTGGAA 3420

GTAATTGTAA GCATCCTGAA ATAAAAAAGC AAGAATATGA AGAAGTAGTT CAGACTGTTA 3480

ATACAGATTT CTCTCCATAT CTGATTTCAG ATAACTTAGA ACAGCCTATG GGAAGTAGTC 3540

ATGCATCTCA GGTTTGTTCT GAGACACCTG ATGACCTGTT AGATGATGGT GAAATAAAGG 3600

AAGATACTAG TTTTGCTGAA AATGACATTA AGGAAAGTTC TGCTGTTTTT AGCAAAAGCG 3660

TCCAGAGAGG AGAGCTTAGC AGGAGTCCTA GCCCTTTCAC CCATACACAT TTGGCTCAGG 3720

GTTACCGAAG AGGGGCCAAG AAATTAGAGT CCTCAGAAGA GAACTTATCT AGTGAGGATG 3780

AAGAGCTTCC CTGCTTCCAA CACTTGTTAT TTGGTAAAGT AAACAATATA CCTTCTCAGT 3840

CTACTAGGCA TAGCACCGTT GCTACCGAGT GTCTGTCTAA GAACACAGAG GAGAATTTAT 3900

TATCATTGAA GAATAGCTTA AATGACTGCA GTAACCAGGT AATATTGGCA AAGGCATCTC 3960

AGGAACATCA CCTTAGTGAG GAAACAAAAT GTTCTGCTAG CTTGTTTTCT TCACAGTGCA 4020

GTGAATTGGA AGACTTGACT GCAAATACAA ACACCCAGGA TCCTTTCTTG ATTGGTTCTT 4080

CCAAACAAAT GAGGCATCAG TCTGAAAGCC AGGGAGTTGG TCTGAGTGAC AAGGAATTGG 4140 
TTTCAGATGA TGAAGAAAGA GGAACGGGCT TGGAAGAAAA TAATCAAGAA GAGCAAAGCA 4200

TGGATTCAAA CTTAGGTGAA GCAGCATCTG GGTGTGAGAG TGAAACAAGC GTCTCTGAAG 4260

ACTGCTCAGG GCTATCCTCT CAGAGTGACA TTTTAACCAC TCAGCAGAGG GATACCATGC 4320

AACATAACCT GATAAAGCTC CAGCAGGAAA TGGCTGAACT AGAAGCTGTG TTAGAACAGC 4380

ATGGGAGCCA GCCTTCTAAC AGCTACCCTT CCATCATAAG TGACTCCTCT GCCCTTGAGG 4440 

ACCTGCGAAA TCCAGAACAA AGCACATCAG AAAAAGCAGT ATTAACTTCA CAGAAAAGTA 4500 

GTGAATACCC TATAAGCCAG AATCCAGAAG GCCTTTCTGC TGACAAGTTT GAGGTGTCTG 4560

CAGATAGTTC TACCAGTAAA AATAAAGAAC CAGGAGTGGA AAGGTCATCC CCTTCTAAAT 4620 

GCCCATCATT AGATGATAGG TGGTACATGC ACAGTTGCTC TGGGAGTCTT CAGAATAGAA 4680

ACTACCCATC TCAAGAGGAG CTCATTAAGG TTGTTGATGT GGAGGAGCAA CAGCTGGAAG 4740

AGTCTGGGCC ACACGATTTG ACGGAAACAT CTTACTTGCC AAGGCAAGAT CTAGAGGGAA 4800 

CCCCTTACCT GGAATCTGGA ATCAGCCTCT TCTCTGATGA CCCTGAATCT GATCCTTCTG 4860

AAGACAGAGC CCCAGAGTCA GCTCGTGTTG GCAACATACC ATCTTCAACC TCTGCATTGA 4920 

AAGTTCCCCA ATTGAAAGTT GCAGAATCTG CCCAGGGTCC AGCTGCTGCT CATACTACTG 4980

ATACTGCTGG GTATAATGCA ATGGAAGAAA GTGTGAGCAG GGAGAAGCCA GAATTGACAG 5040 

CTTCAACAGA AAGGGTCAAC AAAAGAATGT CCATGGTGGT GTCTGGCCTG ACCCCAGAAG 5100

AATTTATGCT CGTGTACAAG TTTGCCAGAA AACACCACAT CACTTTAACT AATCTAATTA 5160

CTGAAGAGAC TACTCATGTT GTTATGAAAA CAGATGCTGA GTTTGTGTGT GAACGGACAC 5220

TGAAATATTT TCTAGGAATT GCGGGAGGAA AATGGGTAGT TAGCTATTTC TGGGTGACCC 5280 

AGTCTATTAA AGAAAGAAAA ATGCTGAATG AGCATGATTT TGAAGTCAGA GGAGATGTGG 5340 

TCAATGGAAG AAACCACCAA GGTCCAAAGC GAGCAAGAGA ATCCCAGGAC AGAAAGATCT 5400 

TCAGGGGGCT AGAAATCTGT TGCTATGGGC CCTTCACCAA CATGCCCACA GATCAACTGG 5460 

AATGGATGGT ACAGCTGTGT GGTGCTTCTG TGGTGAAGGA GCTTTCATCA TTCACCCTTG 5520 

GCACAGGTGT CCACCCAATT GTGGTTGTGC AGCCAGATGC CTGGACAGAG GACAATGGCT 5580 

TCCATGCAAT TGGGCAGATG TGTGAGGCAC CTGTGGTGAC CCGAGAGTGG GTGTTGGACA 5640

GTGTAGCACT CTACCAGTGC CAGGAGCTGG ACACCTACCT GATACCCCAG ATCCCCCACA 5700

GCCACTACTG A 5711 
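
Each row of the sequence description above carries blocks of ten bases followed by a running position counter. The following is a minimal sketch, assuming that layout holds for every row, of how the rows could be joined into one string and the counters checked. The names `parse_numbered_sequence` and `counters_consistent` are illustrative only and are not part of PatentIn.

```python
import re

BASE_BLOCK = re.compile(r"[ACGT]+")


def parse_numbered_sequence(lines):
    """Concatenate the base blocks on each row, ignoring the trailing counter."""
    blocks = []
    for line in lines:
        for token in line.split():
            # Counters are digits, so only pure A/C/G/T tokens are kept.
            if BASE_BLOCK.fullmatch(token.upper()):
                blocks.append(token.upper())
    return "".join(blocks)


def counters_consistent(lines):
    """Return True if every row's trailing counter equals the running base count.

    Assumes each non-blank row ends with an integer counter.
    """
    total = 0
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue
        *blocks, counter = tokens
        total += sum(len(block) for block in blocks)
        if int(counter) != total:
            return False
    return True


# First two rows of SEQ ID NO:1 as printed in this listing.
sample_rows = [
    "AGCTCGCTGA GACTTCCTGG ACCCCGCACC AGGCTGTGGG GTTTCTCAGA TAACTGGGCC 60",
    "CCTGCGCTCA GGAGGCCTTC ACCCTCTGCT CTGGGTAAAG TTCATTGGAA CAGAAAGAAA 120",
]

print(len(parse_numbered_sequence(sample_rows)))
print(counters_consistent(sample_rows))
```

Run over all rows of SEQ ID NO:1, the joined string would be expected to contain 5711 bases if the listing is intact. A failed counter check points to a row where a block was lost or split during extraction.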

(2) INFORMATION FOR SEQ ID NO: 2: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1863 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(B) STRAIN: BRCA1 

(viii) POSITION IN GENOME: 

(A) CHROMOSOME/SEGMENT: 17

(B) MAP POSITION: 17q21 
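
The stated length of 1863 amino acids is consistent with the 5711 base pair cDNA of SEQ ID NO:1 if translation starts at the ATG at position 120 and the final TGA is read as the stop codon. That start position is an inference from the listed sequences, not something the header states. The sketch below checks the arithmetic and translates the first codons with the standard genetic code. The helper names `orf_end` and `translate` are hypothetical.

```python
BASES = "TCAG"
# Standard genetic code in the conventional TCAG ordering; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def orf_end(start, n_residues):
    """1-based position of the last base of the stop codon."""
    return start + 3 * (n_residues + 1) - 1


def translate(cdna, start):
    """Translate from 1-based position `start` until a stop codon or the end."""
    residues = []
    for i in range(start - 1, len(cdna) - 2, 3):
        amino = CODON_TABLE[cdna[i:i + 3]]
        if amino == "*":
            break
        residues.append(THREE_LETTER[amino])
    return residues


# First 140 bases of SEQ ID NO:1 as printed in this listing.
cdna_prefix = "".join("""
AGCTCGCTGA GACTTCCTGG ACCCCGCACC AGGCTGTGGG GTTTCTCAGA TAACTGGGCC
CCTGCGCTCA GGAGGCCTTC ACCCTCTGCT CTGGGTAAAG TTCATTGGAA CAGAAAGAAA
TGGATTTATC TGCTCTTCGC
""".split())

print(orf_end(120, 1863))
print(" ".join(translate(cdna_prefix, 120)))
```

The first seven translated residues can be compared with the opening of the sequence description that follows.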
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Asp Leu Ser Ala Leu Arg Val Glu Glu Val Gln Asn Val Ile Asn
5 10 15

Ala Met Gln Lys Ile Leu Glu Cys Pro Ile Cys Leu Glu Leu Ile Lys
20 25 30

Glu Pro Val Ser Thr Lys Cys Asp His Ile Phe Cys Lys Phe Cys Met
35 40 45

Leu Lys Leu Leu Asn Gln Lys Lys Gly Pro Ser Gln Cys Pro Leu Cys
50 55 60

Lys Asn Asp Ile Thr Lys Arg Ser Leu Gln Glu Ser Thr Arg Phe Ser
65 70 75 80

Gln Leu Val Glu Glu Leu Leu Lys Ile Ile Cys Ala Phe Gln Leu Asp
85 90 95

Thr Gly Leu Glu Tyr Ala Asn Ser Tyr Asn Phe Ala Lys Lys Glu Asn
100 105 110

Asn Ser Pro Glu His Leu Lys Asp Glu Val Ser Ile Ile Gln Ser Met
115 120 125

Gly Tyr Arg Asn Arg Ala Lys Arg Leu Leu Gln Ser Glu Pro Glu Asn
130 135 140

Pro Ser Leu Gln Glu Thr Ser Leu Ser Val Gln Leu Ser Asn Leu Gly
145 150 155 160

Thr Val Arg Thr Leu Arg Thr Lys Gln Arg Ile Gln Pro Gln Lys Thr
165 170 175

Ser Val Tyr Ile Glu Leu Gly Ser Asp Ser Ser Glu Asp Thr Val Asn
180 185 190

Lys Ala Thr Tyr Cys Ser Val Gly Asp Gln Glu Leu Leu Gln Ile Thr
195 200 205

Pro Gln Gly Thr Arg Asp Glu Ile Ser Leu Asp Ser Ala Lys Lys Ala
210 215 220

Ala Cys Glu Phe Ser Glu Thr Asp Val Thr Asn Thr Glu His His Gln
225 230 235 240

Pro Ser Asn Asn Asp Leu Asn Thr Thr Glu Lys Arg Ala Ala Glu Arg 
245 250 255 

His Pro Glu Lys Tyr Gln Gly Ser Ser Val Ser Asn Leu His Val Glu
260 265 270

Pro Cys Gly Thr Asn Thr His Ala Ser Ser Leu Gln His Glu Asn Ser
275 280 285

Ser Leu Leu Leu Thr Lys Asp Arg Met Asn Val Glu Lys Ala Glu Phe
290 295 300

Cys Asn Lys Ser Lys Gln Pro Gly Leu Ala Arg Ser Gln His Asn Arg
305 310 315 320



Trp Ala Gly Ser Lys Glu Thr Cys Asn Asp Arg Arg Thr Pro Ser Thr 
325 330 335 



Glu Lys Lys Val Asp Leu Asn Ala Asp Pro Leu Cys Glu Arg Lys Glu
340 345 350

Trp Asn Lys Gln Lys Leu Pro Cys Ser Glu Asn Pro Arg Asp Thr Glu
355 360 365

Asp Val Pro Trp Ile Thr Leu Asn Ser Ser Ile Gln Lys Val Asn Glu
370 375 380

Trp Phe Ser Arg Ser Asp Glu Leu Leu Gly Ser Asp Asp Ser His Asp 

385 390 395 400 



Gly Glu Ser Glu Ser Asn Ala Lys Val Ala Asp Val Leu Asp Val Leu
405 410 415

Asn Glu Val Asp Glu Tyr Ser Gly Ser Ser Glu Lys Ile Asp Leu Leu
420 425 430

Ala Ser Asp Pro His Glu Ala Leu Ile Cys Lys Ser Glu Arg Val His
435 440 445



Ser Lys Ser Val Glu Ser Asn Ile Glu Asp Lys Ile Phe Gly Lys Thr
450 455 460

Tyr Arg Lys Lys Ala Ser Leu Pro Asn Leu Ser His Val Thr Glu Asn
465 470 475 480

Leu Ile Ile Gly Ala Phe Val Thr Glu Pro Gln Ile Ile Gln Glu Arg
485 490 495

Pro Leu Thr Asn Lys Leu Lys Arg Lys Arg Arg Pro Thr Ser Gly Leu
500 505 510

His Pro Glu Asp Phe Ile Lys Lys Ala Asp Leu Ala Val Gln Lys Thr
515 520 525

Pro Glu Met Ile Asn Gln Gly Thr Asn Gln Thr Glu Gln Asn Gly Gln
530 535 540



Val Met Asn Ile Thr Asn Ser Gly His Glu Asn Lys Thr Lys Gly Asp
545 550 555 560

Ser Ile Gln Asn Glu Lys Asn Pro Asn Pro Ile Glu Ser Leu Glu Lys
565 570 575

Glu Ser Ala Phe Lys Thr Lys Ala Glu Pro Ile Ser Ser Ser Ile Ser
580 585 590

Asn Met Glu Leu Glu Leu Asn Ile His Asn Ser Lys Ala Pro Lys Lys
595 600 605

Asn Arg Leu Arg Arg Lys Ser Ser Thr Arg His Ile His Ala Leu Glu
610 615 620

Leu Val Val Ser Arg Asn Leu Ser Pro Pro Asn Cys Thr Glu Leu Gln
625 630 635 640

Ile Asp Ser Cys Ser Ser Ser Glu Glu Ile Lys Lys Lys Lys Tyr Asn
645 650 655

Gln Met Pro Val Arg His Ser Arg Asn Leu Gln Leu Met Glu Gly Lys
660 665 670

Glu Pro Ala Thr Gly Ala Lys Lys Ser Asn Lys Pro Asn Glu Gln Thr
675 680 685

Ser Lys Arg His Asp Ser Asp Thr Phe Pro Glu Leu Lys Leu Thr Asn 
690 695 700 

Ala Pro Gly Ser Phe Thr Lys Cys Ser Asn Thr Ser Glu Leu Lys Glu
705 710 715 720

Phe Val Asn Pro Ser Leu Pro Arg Glu Glu Lys Glu Glu Lys Leu Glu
725 730 735

Thr Val Lys Val Ser Asn Asn Ala Glu Asp Pro Lys Asp Leu Met Leu
740 745 750

Ser Gly Glu Arg Val Leu Gln Thr Glu Arg Ser Val Glu Ser Ser Ser
755 760 765

Ile Ser Leu Val Pro Gly Thr Asp Tyr Gly Thr Gln Glu Ser Ile Ser
770 775 780

Leu Leu Glu Val Ser Thr Leu Gly Lys Ala Lys Thr Glu Pro Asn Lys
785 790 795 800

Cys Val Ser Gln Cys Ala Ala Phe Glu Asn Pro Lys Gly Leu Ile His
805 810 815

Gly Cys Ser Lys Asp Asn Arg Asn Asp Thr Glu Gly Phe Lys Tyr Pro
820 825 830

Leu Gly His Glu Val Asn His Ser Arg Glu Thr Ser Ile Glu Met Glu
835 840 845



Glu Ser Glu Leu Asp Ala Gln Tyr Leu Gln Asn Thr Phe Lys Val Ser
850 855 860

Lys Arg Gln Ser Phe Ala Leu Phe Ser Asn Pro Gly Asn Ala Glu Glu
865 870 875 880

Glu Cys Ala Thr Phe Ser Ala His Ser Gly Ser Leu Lys Lys Gln Ser
885 890 895

Pro Lys Val Thr Phe Glu Cys Glu Gln Lys Glu Glu Asn Gln Gly Lys
900 905 910

Asn Glu Ser Asn Ile Lys Pro Val Gln Thr Val Asn Ile Thr Ala Gly
915 920 925

Phe Pro Val Val Gly Gln Lys Asp Lys Pro Val Asp Asn Ala Lys Cys
930 935 940

Ser Ile Lys Gly Gly Ser Arg Phe Cys Leu Ser Ser Gln Phe Arg Gly
945 950 955 960

Asn Glu Thr Gly Leu Ile Thr Pro Asn Lys His Gly Leu Leu Gln Asn
965 970 975

Pro Tyr Arg Ile Pro Pro Leu Phe Pro Ile Lys Ser Phe Val Lys Thr
980 985 990

Lys Cys Lys Lys Asn Leu Leu Glu Glu Asn Phe Glu Glu His Ser Met
995 1000 1005

Ser Pro Glu Arg Glu Met Gly Asn Glu Asn Ile Pro Ser Thr Val Ser
1010 1015 1020

Thr Ile Ser Arg Asn Asn Ile Arg Glu Asn Val Phe Lys Gly Ala Ser
1025 1030 1035 1040

Ser Ser Asn Ile Asn Glu Val Gly Ser Ser Thr Asn Glu Val Gly Ser
1045 1050 1055

Ser Ile Asn Glu Ile Gly Ser Ser Asp Glu Asn Ile Gln Ala Glu Leu
1060 1065 1070

Gly Arg Asn Arg Gly Pro Lys Leu Asn Ala Met Leu Arg Leu Gly Val
1075 1080 1085

Leu Gln Pro Glu Val Tyr Lys Gln Ser Leu Pro Gly Ser Asn Cys Lys
1090 1095 1100

His Pro Glu Ile Lys Lys Gln Glu Tyr Glu Glu Val Val Gln Thr Val
1105 1110 1115 1120

Asn Thr Asp Phe Ser Pro Tyr Leu Ile Ser Asp Asn Leu Glu Gln Pro
1125 1130 1135

Met Gly Ser Ser His Ala Ser Gln Val Cys Ser Glu Thr Pro Asp Asp
1140 1145 1150

Leu Leu Asp Asp Gly Glu Ile Lys Glu Asp Thr Ser Phe Ala Glu Asn
1155 1160 1165

Asp Ile Lys Glu Ser Ser Ala Val Phe Ser Lys Ser Val Gln Arg Gly
1170 1175 1180

Glu Leu Ser Arg Ser Pro Ser Pro Phe Thr His Thr His Leu Ala Gln
1185 1190 1195 1200

Gly Tyr Arg Arg Gly Ala Lys Lys Leu Glu Ser Ser Glu Glu Asn Leu 
1205 1210 1215 

Ser Ser Glu Asp Glu Glu Leu Pro Cys Phe Gln His Leu Leu Phe Gly
1220 1225 1230

Lys Val Asn Asn Ile Pro Ser Gln Ser Thr Arg His Ser Thr Val Ala
1235 1240 1245

Thr Glu Cys Leu Ser Lys Asn Thr Glu Glu Asn Leu Leu Ser Leu Lys
1250 1255 1260

Asn Ser Leu Asn Asp Cys Ser Asn Gln Val Ile Leu Ala Lys Ala Ser
1265 1270 1275 1280

Gln Glu His His Leu Ser Glu Glu Thr Lys Cys Ser Ala Ser Leu Phe
1285 1290 1295

Ser Ser Gln Cys Ser Glu Leu Glu Asp Leu Thr Ala Asn Thr Asn Thr
1300 1305 1310

Gln Asp Pro Phe Leu Ile Gly Ser Ser Lys Gln Met Arg His Gln Ser
1315 1320 1325

Glu Ser Gln Gly Val Gly Leu Ser Asp Lys Glu Leu Val Ser Asp Asp
1330 1335 1340

Glu Glu Arg Gly Thr Gly Leu Glu Glu Asn Asn Gln Glu Glu Gln Ser
1345 1350 1355 1360

Met Asp Ser Asn Leu Gly Glu Ala Ala Ser Gly Cys Glu Ser Glu Thr
1365 1370 1375

Ser Val Ser Glu Asp Cys Ser Gly Leu Ser Ser Gln Ser Asp Ile Leu
1380 1385 1390

Thr Thr Gln Gln Arg Asp Thr Met Gln His Asn Leu Ile Lys Leu Gln
1395 1400 1405

Gln Glu Met Ala Glu Leu Glu Ala Val Leu Glu Gln His Gly Ser Gln
1410 1415 1420

Pro Ser Asn Ser Tyr Pro Ser Ile Ile Ser Asp Ser Ser Ala Leu Glu
1425 1430 1435 1440

Asp Leu Arg Asn Pro Glu Gln Ser Thr Ser Glu Lys Ala Val Leu Thr
1445 1450 1455

Ser Gln Lys Ser Ser Glu Tyr Pro Ile Ser Gln Asn Pro Glu Gly Leu
1460 1465 1470

Ser Ala Asp Lys Phe Glu Val Ser Ala Asp Ser Ser Thr Ser Lys Asn
1475 1480 1485

Lys Glu Pro Gly Val Glu Arg Ser Ser Pro Ser Lys Cys Pro Ser Leu
1490 1495 1500

Asp Asp Arg Trp Tyr Met His Ser Cys Ser Gly Ser Leu Gln Asn Arg
1505 1510 1515 1520

Asn Tyr Pro Ser Gln Glu Glu Leu Ile Lys Val Val Asp Val Glu Glu
1525 1530 1535

Gln Gln Leu Glu Glu Ser Gly Pro His Asp Leu Thr Glu Thr Ser Tyr
1540 1545 1550

Leu Pro Arg Gln Asp Leu Glu Gly Thr Pro Tyr Leu Glu Ser Gly Ile
1555 1560 1565

Ser Leu Phe Ser Asp Asp Pro Glu Ser Asp Pro Ser Glu Asp Arg Ala
1570 1575 1580

Pro Glu Ser Ala Arg Val Gly Asn Ile Pro Ser Ser Thr Ser Ala Leu
1585 1590 1595 1600

Lys Val Pro Gln Leu Lys Val Ala Glu Ser Ala Gln Gly Pro Ala Ala
1605 1610 1615

Ala His Thr Thr Asp Thr Ala Gly Tyr Asn Ala Met Glu Glu Ser Val
1620 1625 1630

Ser Arg Glu Lys Pro Glu Leu Thr Ala Ser Thr Glu Arg Val Asn Lys
1635 1640 1645

Arg Met Ser Met Val Val Ser Gly Leu Thr Pro Glu Glu Phe Met Leu
1650 1655 1660

Val Tyr Lys Phe Ala Arg Lys His His Ile Thr Leu Thr Asn Leu Ile
1665 1670 1675 1680

Thr Glu Glu Thr Thr His Val Val Met Lys Thr Asp Ala Glu Phe Val
1685 1690 1695

Cys Glu Arg Thr Leu Lys Tyr Phe Leu Gly Ile Ala Gly Gly Lys Trp
1700 1705 1710

Val Val Ser Tyr Phe Trp Val Thr Gln Ser Ile Lys Glu Arg Lys Met
1715 1720 1725

Leu Asn Glu His Asp Phe Glu Val Arg Gly Asp Val Val Asn Gly Arg
1730 1735 1740

Asn His Gln Gly Pro Lys Arg Ala Arg Glu Ser Gln Asp Arg Lys Ile
1745 1750 1755 1760

Phe Arg Gly Leu Glu Ile Cys Cys Tyr Gly Pro Phe Thr Asn Met Pro
1765 1770 1775

Thr Asp Gln Leu Glu Trp Met Val Gln Leu Cys Gly Ala Ser Val Val
1780 1785 1790

Lys Glu Leu Ser Ser Phe Thr Leu Gly Thr Gly Val His Pro Ile Val
1795 1800 1805

Val Val Gln Pro Asp Ala Trp Thr Glu Asp Asn Gly Phe His Ala Ile
1810 1815 1820

Gly Gln Met Cys Glu Ala Pro Val Val Thr Arg Glu Trp Val Leu Asp
1825 1830 1835 1840

Ser Val Ala Leu Tyr Gln Cys Gln Glu Leu Asp Thr Tyr Leu Ile Pro
1845 1850 1855

Gln Ile Pro His Ser His Tyr
1860



(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5711 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(B) STRAIN: BRCA1

(viii) POSITION IN GENOME:

(A) CHROMOSOME/SEGMENT: 17

(B) MAP POSITION: 17q21 
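
SEQ ID NO:1 and SEQ ID NO:3 have the same stated length and map position, so their rows can be compared position by position. The sketch below, with the hypothetical helper `differences`, compares the rows ending at position 2220 as they are printed in this listing. A mismatch found this way may reflect extraction damage in the text rather than a real sequence difference, so each hit should be checked against the original document.

```python
def differences(row_a, row_b, offset=1):
    """List (position, base_a, base_b) for every mismatch between two rows.

    `offset` is the 1-based sequence position of the first base in the rows.
    """
    a = "".join(row_a.split())
    b = "".join(row_b.split())
    if len(a) != len(b):
        raise ValueError("rows must contain the same number of bases")
    return [(offset + i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Rows covering positions 2161-2220, copied from SEQ ID NO:1 and SEQ ID NO:3.
row_seq1 = "GTAACAAGCC AAATGAACAG ACAAGTAAAA GACATGACAG TGATACTTTC CCAGAGCTGA"
row_seq3 = "GTAACAAGCC AAATGAACAG ACAAGTAAAA GACATGACAG CGATACTTTC CCAGAGCTGA"

print(differences(row_seq1, row_seq3, offset=2161))  # [(2201, 'T', 'C')]
```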



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

AGCTCGCTGA GACTTCCTGG ACCCCGCACC AGGCTGTGGG GTTTCTCAGA TAACTGGGCC 60 

CCTGCGCTCA GGAGGCCTTC ACCCTCTGCT CTGGGTAAAG TTCATTGGAA CAGAAAGAAA 120 

TGGATTTATC TGCTCTTCGC GTTGAAGAAG TACAAAATGT CATTAATGCT ATGCAGAAAA 180 

TCTTAGAGTG TCCCATCTGT CTGGAGTTGA TCAAGGAACC TGTCTCCACA AAGTGTGACC 240 

ACATATTTTG CAAATTTTGC ATGCTGAAAC TTCTCAACCA GAAGAAAGGG CCTTCACAGT 300

GTCCTTTATG TAAGAATGAT ATAACCAAAA GGAGCCTACA AGAAAGTACG AGATTTAGTC 360

AACTTGTTGA AGAGCTATTG AAAATCATTT GTGCTTTTCA GCTTGACACA GGTTTGGAGT 420

ATGCAAACAG CTATAATTTT GCAAAAAAGG AAAATAACTC TCCTGAACAT CTAAAAGATG 480

AAGTTTCTAT CATCCAAAGT ATGGGCTACA GAAACCGTGC CAAAAGACTT CTACAGAGTG 540 

AACCCGAAAA TCCTTCCTTG CAGGAAACCA GTCTCAGTGT CCAACTCTCT AACCTTGGAA 600 

CTGTGAGAAC TCTGAGGACA AAGCAGCGGA TACAACCTCA AAAGACGTCT GTCTACATTG 660 

AATTGGGATC TGATTCTTCT GAAGATACCG TTAATAAGGC AACTTATTGC AGTGTGGGAG 720

ATCAAGAATT GTTACAAATC ACCCCTCAAG GAACCAGGGA TGAAATCAGT TTGGATTCTG 780

CAAAAAAGGC TGCTTGTGAA TTTTCTGAGA CGGATGTAAC AAATACTGAA CATCATCAAC 840

CCAGTAATAA TGATTTGAAC ACCACTGAGA AGCGTGCAGC TGAGAGGCAT CCAGAAAAGT 900

ATCAGGGTAG TTCTGTTTCA AACTTGCATG TGGAGCCATG TGGCACAAAT ACTCATGCCA 960

GCTCATTACA GCATGAGAAC AGCAGTTTAT TACTCACTAA AGACAGAATG AATGTAGAAA 1020 

AGGCTGAATT CTGTAATAAA AGCAAACAGC CTGGCTTAGC AAGGAGCCAA CATAACAGAT 1080 

GGGCTGGAAG TAAGGAAACA TGTAATGATA GGCGGACTCC CAGCACAGAA AAAAAGGTAG 1140 

ATCTGAATGC TGATCCCCTG TGTGAGAGAA AAGAATGGAA TAAGCAGAAA CTGCCATGCT 1200 

CAGAGAATCC TAGAGATACT GAAGATGTTC CTTGGATAAC ACTAAATAGC AGCATTCAGA 1260 

AAGTTAATGA GTGGTTTTCC AGAAGTGATG AACTGTTAGG TTCTGATGAC TCACATGATG 1320

GGGAGTCTGA ATCAAATGCC AAAGTAGCTG ATGTATTGGA CGTTCTAAAT GAGGTAGATG 1380

AATATTCTGG TTCTTCAGAG AAAATAGACT TACTGGCCAG TGATCCTCAT GAGGCTTTAA 1440 

TATGTAAAAG TGAAAGAGTT CACTCCAAAT CAGTAGAGAG TAATATTGAA GACAAAATAT 1500 

TTGGGAAAAC CTATCGGAAG AAGGCAAGCC TCCCCAACTT AAGCCATGTA ACTGAAAATC 1560 

TAATTATAGG AGCATTTGTT ACTGAGCCAC AGATAATACA AGAGCGTCCC CTCACAAATA 1620 
AATTAAAGCG TAAAAGGAGA CCTACATCAG GCCTTCATCC TGAGGATTTT ATCAAGAAAG 1680

CAGATTTGGC AGTTCAAAAG ACTCCTGAAA TGATAAATCA GGGAACTAAC CAAACGGAGC 1740

AGAATGGTCA AGTGATGAAT ATTACTAATA GTGGTCATGA GAATAAAACA AAAGGTGATT 1800

CTATTCAGAA TGAGAAAAAT CCTAACCCAA TAGAATCACT CGAAAAAGAA TCTGCTTTCA 1860

AAACGAAAGC TGAACCTATA AGCAGCAGTA TAAGCAATAT GGAACTCGAA TTAAATATCC 1920 

ACAATTCAAA AGCACCTAAA AAGAATAGGC TGAGGAGGAA GTCTTCTACC AGGCATATTC 1980

ATGCGCTTGA ACTAGTAGTC AGTAGAAATC TAAGCCCACC TAATTGTACT GAATTGCAAA 2040

TTGATAGTTG TTCTAGCAGT GAAGAGATAA AGAAAAAAAA GTACAACCAA ATGCCAGTCA 2100

GGCACAGCAG AAACCTACAA CTCATGGAAG GTAAAGAACC TGCAACTGGA GCCAAGAAGA 2160

GTAACAAGCC AAATGAACAG ACAAGTAAAA GACATGACAG CGATACTTTC CCAGAGCTGA 2220

AGTTAACAAA TGCACCTGGT TCTTTTACTA AGTGTTCAAA TACCAGTGAA CTTAAAGAAT 2280

TTGTCAATCC TAGCCTTCCA AGAGAAGAAA AAGAAGAGAA ACTAGAAACA GTTAAAGTGT 2340

CTAATAATGC TGAAGACCCC AAAGATCTCA TGTTAAGTGG AGAAAGGGTT TTGCAAACTG 2400

AAAGATCTGT AGAGAGTAGC AGTATTTCAT TGGTACCTGG TACTGATTAT GGCACTCAGG 2460

AAAGTATCTC GTTACTGGAA GTTAGCACTC TAGGGAAGGC AAAAACAGAA CCAAATAAAT 2520

GTGTGAGTCA GTGTGCAGCA TTTGAAAACC CCAAGGGACT AATTCATGGT TGTTCCAAAG 2580

ATAATAGAAA TGACACAGAA GGCTTTAAGT ATCCATTGGG ACATGAAGTT AACCACAGTC 2640

GGGAAACAAG CATAGAAATG GAAGAAAGTG AACTTGATGC TCAGTATTTG CAGAATACAT 2700

TCAAGGTTTC AAAGCGCCAG TCATTTGCTC TGTTTTCAAA TCCAGGAAAT GCAGAAGAGG 2760

AATGTGCAAC ATTCTCTGCC CACTCTGGGT CCTTAAAGAA ACAAAGTCCA AAAGTCACTT 2820

TTGAATGTGA ACAAAAGGAA GAAAATCAAG GAAAGAATGA GTCTAATATC AAGCCTGTAC 2880

AGACAGTTAA TATCACTGCA GGCTTTCCTG TGGTTGGTCA GAAAGATAAG CCAGTTGATA 2940

ATGCCAAATG TAGTATCAAA GGAGGCTCTA GGTTTTGTCT ATCATCTCAG TTCAGAGGCA 3000

ACGAAACTGG ACTCATTACT CCAAATAAAC ATGGACTTTT ACAAAACCCA TATCGTATAC 3060

CACCACTTTT TCCCATCAAG TCATTTGTTA AAACTAAATG TAAGAAAAAT CTGCTAGAGG 3120 

AAAACTTTGA GGAACATTCA ATGTCACCTG AAAGAGAAAT GGGAAATGAG AACATTCCAA 3180

GTACAGTGAG CACAATTAGC CGTAATAACA TTAGAGAAAA TGTTTTTAAA GAAGCCAGCT 3240

CAAGCAATAT TAATGAAGTA GGTTCCAGTA CTAATGAAGT GGGCTCCAGT ATTAATGAAA 3300

TAGGTTCCAG TGATGAAAAC ATTCAAGCAG AACTAGGTAG AAACAGAGGG CCAAAATTGA 3360

ATGCTATGCT TAGATTAGGG GTTTTGCAAC CTGAGGTCTA TAAACAAAGT CTTCCTGGAA 3420

GTAATTGTAA GCATCCTGAA ATAAAAAAGC AAGAATATGA AGAAGTAGTT CAGACTGTTA 3480

ATACAGATTT CTCTCCATAT CTGATTTCAG ATAACTTAGA ACAGCCTATG GGAAGTAGTC 3540

ATGCATCTCA GGTTTGTTCT GAGACACCTG ATGACCTGTT AGATGATGGT GAAATAAAGG 3600

AAGATACTAG TTTTGCTGAA AATGACATTA AGGAAAGTTC TGCTGTTTTT AGCAAAAGCG 3660 

TCCAGAAAGG AGAGCTTAGC AGGAGTCCTA GCCCTTTCAC CCATACACAT TTGGCTCAGG 3720 

GTTACCGAAG AGGGGCCAAG AAATTAGAGT CCTCAGAAGA GAACTTATCT AGTGAGGATG 3780 

AAGAGCTTCC CTGCTTCCAA CACTTGTTAT TTGGTAAAGT AAACAATATA CCTTCTCAGT 3840

CTACTAGGCA TAGCACCGTT GCTACCGAGT GTCTGTCTAA GAACACAGAG GAGAATTTAT 3900 

TATCATTGAA GAATAGCTTA AATGACTGCA GTAACCAGGT AATATTGGCA AAGGCATCTC 3960 

AGGAACATCA CCTTAGTGAG GAAACAAAAT GTTCTGCTAG CTTGTTTTCT TCACAGTGCA 4020 

GTGAATTGGA AGACTTGACT GCAAATACAA ACACCCAGGA TCCTTTCTTG ATTGGTTCTT 4080

CCAAACAAAT GAGGCATCAG TCTGAAAGCC AGGGAGTTGG TCTGAGTGAC AAGGAATTGG 4140 

TTTCAGATGA TGAAGAAAGA GGAACGGGCT TGGAAGAAAA TAATCAAGAA GAGCAAAGCA 4200 

TGGATTCAAA CTTAGGTGAA GCAGCATCTG GGTGTGAGAG TGAAACAAGC GTCTCTGAAG 4260 

ACTGCTCAGG GCTATCCTCT CAGAGTGACA TTTTAACCAC TCAGCAGAGG GATACCATGC 4320

AACATAACCT GATAAAGCTC CAGCAGGAAA TGGCTGAACT AGAAGCTGTG TTAGAACAGC 4380

ATGGGAGCCA GCCTTCTAAC AGCTACCCTT CCATCATAAG TGACTCTTCT GCCCTTGAGG 4440

ACCTGCGAAA TCCAGAACAA AGCACATCAG AAAAAGCAGT ATTAACTTCA CAGAAAAGTA 4500

GTGAATACCC TATAAGCCAG AATCCAGAAG GCCTTTCTGC TGACAAGTTT GAGGTGTCTG 4560

CAGATAGTTC TACCAGTAAA AATAAAGAAC CAGGAGTGGA AAGGTCATCC CCTTCTAAAT 4620

GCCCATCATT AGATGATAGG TGGTACATGC ACAGTTGCTC TGGGAGTCTT CAGAATAGAA 4680

ACTACCCATC TCAAGAGGAG CTCATTAAGG TTGTTGATGT GGAGGAGCAA CAGCTGGAAG 4740

AGTCTGGGCC ACACGATTTG ACGGAAACAT CTTACTTGCC AAGGCAAGAT CTAGAGGGAA 4800

CCCCTTACCT GGAATCTGGA ATCAGCCTCT TCTCTGATGA CCCTGAATCT GATCCTTCTG 4860

AAGACAGAGC CCCAGAGTCA GCTCGTGTTG GCAACATACC ATCTTCAACC TCTGCATTGA 4920

AAGTTCCCCA ATTGAAAGTT GCAGAATCTG CCCAGAGTCC AGCTGCTGCT CATACTACTG 4980 

ATACTGCTGG GTATAATGCA ATGGAAGAAA GTGTGAGCAG GGAGAAGCCA GAATTGACAG 5040 

CTTCAACAGA AAGGGTCAAC AAAAGAATGT CCATGGTGGT GTCTGGCCTG ACCCCAGAAG 5100 

AATTTATGCT CGTGTACAAG TTTGCCAGAA AACACCACAT CACTTTAACT AATCTAATTA 5160

CTGAAGAGAC TACTCATGTT GTTATGAAAA CAGATGCTGA GTTTGTGTGT GAACGGACAC 5220

TGAAATATTT TCTAGGAATT GCGGGAGGAA AATGGGTAGT TAGCTATTTC TGGGTGACCC 5280

AGTCTATTAA AGAAAGAAAA ATGCTGAATG AGCATGATTT TGAAGTCAGA GGAGATGTGG 5340 

TCAATGGAAG AAACCACCAA GGTCCAAAGC GAGCAAGAGA ATCCCAGGAC AGAAAGATCT 5400 

TCAGGGGGCT AGAAATCTGT TGCTATGGGC CCTTCACCAA CATGCCCACA GATCAACTGG 5460

AATGGATGGT ACAGCTGTGT GGTGCTTCTG TGGTGAAGGA GCTTTCATCA TTCACCCTTG 5520

GCACAGGTGT CCACCCAATT GTGGTTGTGC AGCCAGATGC CTGGACAGAG GACAATGGCT 5580 

TCCATGCAAT TGGGCAGATG TGTGAGGCAC CTGTGGTGAC CCGAGAGTGG GTGTTGGACA 5640

GTGTAGCACT CTACCAGTGC CAGGAGCTGG ACACCTACCT GATACCCCAG ATCCCCCACA 5700

GCCACTACTG A 5711

(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1863 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(B) STRAIN: BRCA1

(viii) POSITION IN GENOME:

(A) CHROMOSOME/SEGMENT: 17

(B) MAP POSITION: 17q21

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Met Asp Leu Ser Ala Leu Arg Val Glu Glu Val Gln Asn Val Ile Asn
5 10 15

Ala Met Gln Lys Ile Leu Glu Cys Pro Ile Cys Leu Glu Leu Ile Lys
20 25 30

Glu Pro Val Ser Thr Lys Cys Asp His Ile Phe Cys Lys Phe Cys Met
35 40 45

Leu Lys Leu Leu Asn Gln Lys Lys Gly Pro Ser Gln Cys Pro Leu Cys
50 55 60

Lys Asn Asp Ile Thr Lys Arg Ser Leu Gln Glu Ser Thr Arg Phe Ser
65 70 75 80

Gln Leu Val Glu Glu Leu Leu Lys Ile Ile Cys Ala Phe Gln Leu Asp
85 90 95

Thr Gly Leu Glu Tyr Ala Asn Ser Tyr Asn Phe Ala Lys Lys Glu Asn
100 105 110

Asn Ser Pro Glu His Leu Lys Asp Glu Val Ser Ile Ile Gln Ser Met
115 120 125

Gly Tyr Arg Asn Arg Ala Lys Arg Leu Leu Gln Ser Glu Pro Glu Asn
130 135 140

Pro Ser Leu Gln Glu Thr Ser Leu Ser Val Gln Leu Ser Asn Leu Gly
145 150 155 160

Thr Val Arg Thr Leu Arg Thr Lys Gln Arg Ile Gln Pro Gln Lys Thr
165 170 175

Ser Val Tyr Ile Glu Leu Gly Ser Asp Ser Ser Glu Asp Thr Val Asn
180 185 190

Lys Ala Thr Tyr Cys Ser Val Gly Asp Gln Glu Leu Leu Gln Ile Thr
195 200 205

Pro Gln Gly Thr Arg Asp Glu Ile Ser Leu Asp Ser Ala Lys Lys Ala
210 215 220

Ala Cys Glu Phe Ser Glu Thr Asp Val Thr Asn Thr Glu His His Gln
225 230 235 240

Pro Ser Asn Asn Asp Leu Asn Thr Thr Glu Lys Arg Ala Ala Glu Arg
245 250 255



His Pro Glu Lys Tyr Gln Gly Ser Ser Val Ser Asn Leu His Val Glu
260 265 270

Pro Cys Gly Thr Asn Thr His Ala Ser Ser Leu Gln His Glu Asn Ser
275 280 285

Ser Leu Leu Leu Thr Lys Asp Arg Met Asn Val Glu Lys Ala Glu Phe
290 295 300

Cys Asn Lys Ser Lys Gln Pro Gly Leu Ala Arg Ser Gln His Asn Arg
305 310 315 320

Trp Ala Gly Ser Lys Glu Thr Cys Asn Asp Arg Arg Thr Pro Ser Thr
325 330 335

Glu Lys Lys Val Asp Leu Asn Ala Asp Pro Leu Cys Glu Arg Lys Glu
340 345 350

Trp Asn Lys Gln Lys Leu Pro Cys Ser Glu Asn Pro Arg Asp Thr Glu
355 360 365

Asp Val Pro Trp Ile Thr Leu Asn Ser Ser Ile Gln Lys Val Asn Glu
370 375 380

Trp Phe Ser Arg Ser Asp Glu Leu Leu Gly Ser Asp Asp Ser His Asp
385 390 395 400

Gly Glu Ser Glu Ser Asn Ala Lys Val Ala Asp Val Leu Asp Val Leu
405 410 415

Asn Glu Val Asp Glu Tyr Ser Gly Ser Ser Glu Lys Ile Asp Leu Leu
420 425 430

Ala Ser Asp Pro His Glu Ala Leu Ile Cys Lys Ser Glu Arg Val His
435 440 445

Ser Lys Ser Val Glu Ser Asn Ile Glu Asp Lys Ile Phe Gly Lys Thr
450 455 460

Tyr Arg Lys Lys Ala Ser Leu Pro Asn Leu Ser His Val Thr Glu Asn
465 470 475 480

Leu Ile Ile Gly Ala Phe Val Thr Glu Pro Gln Ile Ile Gln Glu Arg
485 490 495

Pro Leu Thr Asn Lys Leu Lys Arg Lys Arg Arg Pro Thr Ser Gly Leu
500 505 510

His Pro Glu Asp Phe Ile Lys Lys Ala Asp Leu Ala Val Gln Lys Thr
515 520 525

Pro Glu Met Ile Asn Gln Gly Thr Asn Gln Thr Glu Gln Asn Gly Gln
530 535 540

Val Met Asn Ile Thr Asn Ser Gly His Glu Asn Lys Thr Lys Gly Asp
545 550 555 560

Ser Ile Gln Asn Glu Lys Asn Pro Asn Pro Ile Glu Ser Leu Glu Lys
565 570 575

Glu Ser Ala Phe Lys Thr Lys Ala Glu Pro Ile Ser Ser Ser Ile Ser
580 585 590

Asn Met Glu Leu Glu Leu Asn Ile His Asn Ser Lys Ala Pro Lys Lys
595 600 605

Asn Arg Leu Arg Arg Lys Ser Ser Thr Arg His Ile His Ala Leu Glu
610 615 620

Leu Val Val Ser Arg Asn Leu Ser Pro Pro Asn Cys Thr Glu Leu Gln
625 630 635 640

Ile Asp Ser Cys Ser Ser Ser Glu Glu Ile Lys Lys Lys Lys Tyr Asn
645 650 655

Gln Met Pro Val Arg His Ser Arg Asn Leu Gln Leu Met Glu Gly Lys
660 665 670



Glu Pro Ala Thr Gly Ala Lys Lys Ser Asn Lys Pro Asn Glu Gln Thr
675 680 685

Ser Lys Arg His Asp Ser Asp Thr Phe Pro Glu Leu Lys Leu Thr Asn
690 695 700

Ala Pro Gly Ser Phe Thr Lys Cys Ser Asn Thr Ser Glu Leu Lys Glu
705 710 715 720

Phe Val Asn Pro Ser Leu Pro Arg Glu Glu Lys Glu Glu Lys Leu Glu
725 730 735

Thr Val Lys Val Ser Asn Asn Ala Glu Asp Pro Lys Asp Leu Met Leu
740 745 750

Ser Gly Glu Arg Val Leu Gln Thr Glu Arg Ser Val Glu Ser Ser Ser
755 760 765

Ile Ser Leu Val Pro Gly Thr Asp Tyr Gly Thr Gln Glu Ser Ile Ser
770 775 780



Leu Leu Glu Val Ser Thr Leu Gly Lys Ala Lys Thr Glu Pro Asn Lys
785 790 795 800

Cys Val Ser Gln Cys Ala Ala Phe Glu Asn Pro Lys Gly Leu Ile His
805 810 815

Gly Cys Ser Lys Asp Asn Arg Asn Asp Thr Glu Gly Phe Lys Tyr Pro
820 825 830

Leu Gly His Glu Val Asn His Ser Arg Glu Thr Ser Ile Glu Met Glu
835 840 845

Glu Ser Glu Leu Asp Ala Gln Tyr Leu Gln Asn Thr Phe Lys Val Ser
850 855 860

Lys Arg Gln Ser Phe Ala Leu Phe Ser Asn Pro Gly Asn Ala Glu Glu
865 870 875 880

Glu Cys Ala Thr Phe Ser Ala His Ser Gly Ser Leu Lys Lys Gln Ser
885 890 895

Pro Lys Val Thr Phe Glu Cys Glu Gln Lys Glu Glu Asn Gln Gly Lys
900 905 910

Asn Glu Ser Asn Ile Lys Pro Val Gln Thr Val Asn Ile Thr Ala Gly
915 920 925

Phe Pro Val Val Gly Gln Lys Asp Lys Pro Val Asp Asn Ala Lys Cys
930 935 940



Ser Ile Lys Gly Gly Ser Arg Phe Cys Leu Ser Ser Gln Phe Arg Gly
945 950 955 960

Asn Glu Thr Gly Leu Ile Thr Pro Asn Lys His Gly Leu Leu Gln Asn
965 970 975

Pro Tyr Arg Ile Pro Pro Leu Phe Pro Ile Lys Ser Phe Val Lys Thr
980 985 990

Lys Cys Lys Lys Asn Leu Leu Glu Glu Asn Phe Glu Glu His Ser Met
995 1000 1005

Ser Pro Glu Arg Glu Met Gly Asn Glu Asn Ile Pro Ser Thr Val Ser
1010 1015 1020

Thr Ile Ser Arg Asn Asn Ile Arg Glu Asn Val Phe Lys Glu Ala Ser
1025 1030 1035 1040

Ser Ser Asn Ile Asn Glu Val Gly Ser Ser Thr Asn Glu Val Gly Ser
1045 1050 1055

Ser Ile Asn Glu Ile Gly Ser Ser Asp Glu Asn Ile Gln Ala Glu Leu
1060 1065 1070

Gly Arg Asn Arg Gly Pro Lys Leu Asn Ala Met Leu Arg Leu Gly Val
1075 1080 1085

Leu Gln Pro Glu Val Tyr Lys Gln Ser Leu Pro Gly Ser Asn Cys Lys
1090 1095 1100

His Pro Glu Ile Lys Lys Gln Glu Tyr Glu Glu Val Val Gln Thr Val
1105 1110 1115 1120

Asn Thr Asp Phe Ser Pro Tyr Leu Ile Ser Asp Asn Leu Glu Gln Pro
1125 1130 1135

Met Gly Ser Ser His Ala Ser Gln Val Cys Ser Glu Thr Pro Asp Asp
1140 1145 1150

Leu Leu Asp Asp Gly Glu Ile Lys Glu Asp Thr Ser Phe Ala Glu Asn
1155 1160 1165

Asp Ile Lys Glu Ser Ser Ala Val Phe Ser Lys Ser Val Gln Lys Gly
1170 1175 1180

Glu Leu Ser Arg Ser Pro Ser Pro Phe Thr His Thr His Leu Ala Gln
1185 1190 1195 1200

Gly Tyr Arg Arg Gly Ala Lys Lys Leu Glu Ser Ser Glu Glu Asn Leu
1205 1210 1215

Ser Ser Glu Asp Glu Glu Leu Pro Cys Phe Gln His Leu Leu Phe Gly
1220 1225 1230

Lys Val Asn Asn Ile Pro Ser Gln Ser Thr Arg His Ser Thr Val Ala
1235 1240 1245

Thr Glu Cys Leu Ser Lys Asn Thr Glu Glu Asn Leu Leu Ser Leu Lys
1250 1255 1260

Asn Ser Leu Asn Asp Cys Ser Asn Gln Val Ile Leu Ala Lys Ala Ser
1265 1270 1275 1280

Gln Glu His His Leu Ser Glu Glu Thr Lys Cys Ser Ala Ser Leu Phe
1285 1290 1295

Ser Ser Gln Cys Ser Glu Leu Glu Asp Leu Thr Ala Asn Thr Asn Thr
1300 1305 1310

Gln Asp Pro Phe Leu Ile Gly Ser Ser Lys Gln Met Arg His Gln Ser
1315 1320 1325

Glu Ser Gln Gly Val Gly Leu Ser Asp Lys Glu Leu Val Ser Asp Asp
1330 1335 1340

Glu Glu Arg Gly Thr Gly Leu Glu Glu Asn Asn Gln Glu Glu Gln Ser
1345 1350 1355 1360

Met Asp Ser Asn Leu Gly Glu Ala Ala Ser Gly Cys Glu Ser Glu Thr
1365 1370 1375

Ser Val Ser Glu Asp Cys Ser Gly Leu Ser Ser Gln Ser Asp Ile Leu
1380 1385 1390

Thr Thr Gln Gln Arg Asp Thr Met Gln His Asn Leu Ile Lys Leu Gln
1395 1400 1405

Gln Glu Met Ala Glu Leu Glu Ala Val Leu Glu Gln His Gly Ser Gln
1410 1415 1420

Pro Ser Asn Ser Tyr Pro Ser Ile Ile Ser Asp Ser Ser Ala Leu Glu
1425 1430 1435 1440

Asp Leu Arg Asn Pro Glu Gln Ser Thr Ser Glu Lys Ala Val Leu Thr
1445 1450 1455

Ser Gln Lys Ser Ser Glu Tyr Pro Ile Ser Gln Asn Pro Glu Gly Leu
1460 1465 1470

Ser Ala Asp Lys Phe Glu Val Ser Ala Asp Ser Ser Thr Ser Lys Asn
1475 1480 1485

Lys Glu Pro Gly Val Glu Arg Ser Ser Pro Ser Lys Cys Pro Ser Leu
1490 1495 1500

Asp Asp Arg Trp Tyr Met His Ser Cys Ser Gly Ser Leu Gln Asn Arg
1505 1510 1515 1520

Asn Tyr Pro Ser Gln Glu Glu Leu Ile Lys Val Val Asp Val Glu Glu
1525 1530 1535

Gln Gln Leu Glu Glu Ser Gly Pro His Asp Leu Thr Glu Thr Ser Tyr
1540 1545 1550

Leu Pro Arg Gln Asp Leu Glu Gly Thr Pro Tyr Leu Glu Ser Gly Ile
1555 1560 1565

Ser Leu Phe Ser Asp Asp Pro Glu Ser Asp Pro Ser Glu Asp Arg Ala
1570 1575 1580

Pro Glu Ser Ala Arg Val Gly Asn Ile Pro Ser Ser Thr Ser Ala Leu
1585 1590 1595 1600

Lys Val Pro Gln Leu Lys Val Ala Glu Ser Ala Gln Ser Pro Ala Ala
1605 1610 1615

Ala His Thr Thr Asp Thr Ala Gly Tyr Asn Ala Met Glu Glu Ser Val
1620 1625 1630

Ser Arg Glu Lys Pro Glu Leu Thr Ala Ser Thr Glu Arg Val Asn Lys
1635 1640 1645

Arg Met Ser Met Val Val Ser Gly Leu Thr Pro Glu Glu Phe Met Leu
1650 1655 1660

Val Tyr Lys Phe Ala Arg Lys His His Ile Thr Leu Thr Asn Leu Ile
1665 1670 1675 1680

Thr Glu Glu Thr Thr His Val Val Met Lys Thr Asp Ala Glu Phe Val
1685 1690 1695

Cys Glu Arg Thr Leu Lys Tyr Phe Leu Gly Ile Ala Gly Gly Lys Trp
1700 1705 1710

Val Val Ser Tyr Phe Trp Val Thr Gln Ser Ile Lys Glu Arg Lys Met
1715 1720 1725

Leu Asn Glu His Asp Phe Glu Val Arg Gly Asp Val Val Asn Gly Arg
1730 1735 1740

Asn His Gln Gly Pro Lys Arg Ala Arg Glu Ser Gln Asp Arg Lys Ile
1745 1750 1755 1760

Phe Arg Gly Leu Glu Ile Cys Cys Tyr Gly Pro Phe Thr Asn Met Pro
1765 1770 1775

Thr Asp Gln Leu Glu Trp Met Val Gln Leu Cys Gly Ala Ser Val Val
1780 1785 1790

Lys Glu Leu Ser Ser Phe Thr Leu Gly Thr Gly Val His Pro Ile Val
1795 1800 1805

Val Val Gln Pro Asp Ala Trp Thr Glu Asp Asn Gly Phe His Ala Ile
1810 1815 1820

Gly Gln Met Cys Glu Ala Pro Val Val Thr Arg Glu Trp Val Leu Asp
1825 1830 1835 1840

Ser Val Ala Leu Tyr Gln Cys Gln Glu Leu Asp Thr Tyr Leu Ile Pro
1845 1850 1855

Gln Ile Pro His Ser His Tyr
1860



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5711 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(B) STRAIN: BRCA1 

(viii) POSITION IN GENOME: 

(A) CHROMOSOME/SEGMENT: 17

(B) MAP POSITION: 17q21 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
AGCTCGCTGA GACTTCCTGG ACCCCGCACC AGGCTGTGGG GTTTCTCAGA TAACTGGGCC 60

CCTGCGCTCA GGAGGCCTTC ACCCTCTGCT CTGGGTAAAG TTCATTGGAA CAGAAAGAAA 120 

TGGATTTATC TGCTCTTCGC GTTGAAGAAG TACAAAATGT CATTAATGCT ATGCAGAAAA 180 

TCTTAGAGTG TCCCATCTGT CTGGAGTTGA TCAAGGAACC TGTCTCCACA AAGTGTGACC 240 

ACATATTTTG CAAATTTTGC ATGCTGAAAC TTCTCAACCA GAAGAAAGGG CCTTCACAGT 300 

GTCCTTTATG TAAGAATGAT ATAACCAAAA GGAGCCTACA AGAAAGTACG AGATTTAGTC 360

AACTTGTTGA AGAGCTATTG AAAATCATTT GTGCTTTTCA GCTTGACACA GGTTTGGAGT 420 

ATGCAAACAG CTATAATTTT GCAAAAAAGG AAAATAACTC TCCTGAACAT CTAAAAGATG 480 

AAGTTTCTAT CATCCAAAGT ATGGGCTACA GAAACCGTGC CAAAAGACTT CTACAGAGTG 540 

AACCCGAAAA TCCTTCCTTG CAGGAAACCA GTCTCAGTGT CCAACTCTCT AACCTTGGAA 600 

CTGTGAGAAC TCTGAGGACA AAGCAGCGGA TACAACCTCA AAAGACGTCT GTCTACATTG 660

AATTGGGATC TGATTCTTCT GAAGATACCG TTAATAAGGC AACTTATTGC AGTGTGGGAG 720 

ATCAAGAATT GTTACAAATC ACCCCTCAAG GAACCAGGGA TGAAATCAGT TTGGATTCTG 780

CAAAAAAGGC TGCTTGTGAA TTTTCTGAGA CGGATGTAAC AAATACTGAA CATCATCAAC 840 

CCAGTAATAA TGATTTGAAC ACCACTGAGA AGCGTGCAGC TGAGAGGCAT CCAGAAAAGT 900 

ATCAGGGTAG TTCTGTTTCA AACTTGCATG TGGAGCCATG TGGCACAAAT ACTCATGCCA 960

GCTCATTACA GCATGAGAAC AGCAGTTTAT TACTCACTAA AGACAGAATG AATGTAGAAA 1020 

AGGCTGAATT CTGTAATAAA AGCAAACAGC CTGGCTTAGC AAGGAGCCAA CATAACAGAT 1080

GGGCTGGAAG TAAGGAAACA TGTAATGATA GGCGGACTCC CAGCACAGAA AAAAAGGTAG 1140 

ATCTGAATGC TGATCCCCTG TGTGAGAGAA AAGAATGGAA TAAGCAGAAA CTGCCATGCT 1200
CAGAGAATCC TAGAGATACT GAAGATGTTC CTTGGATAAC ACTAAATAGC AGCATTCAGA 1260
AAGTTAATGA GTGGTTTTCC AGAAGTGATG AACTGTTAGG TTCTGATGAC TCACATGATG 1320
GGGAGTCTGA ATCAAATGCC AAAGTAGCTG ATGTATTGGA CGTTCTAAAT GAGGTAGATG 1380
AATATTCTGG TTCTTCAGAG AAAATAGACT TACTGGCCAG TGATCCTCAT GAGGCTTTAA 1440
TATGTAAAAG TGAAAGAGTT CACTCCAAAT CAGTAGAGAG TAATATTGAA GACAAAATAT 1500
TTGGGAAAAC CTATCGGAAG AAGGCAAGCC TCCCCAACTT AAGCCATGTA ACTGAAAATC 1560
TAATTATAGG AGCATTTGTT ACTGAGCCAC AGATAATACA AGAGCGTCCC CTCACAAATA 1620
AATTAAAGCG TAAAAGGAGA CCTACATCAG GCCTTCATCC TGAGGATTTT ATCAAGAAAG 1680
CAGATTTGGC AGTTCAAAAG ACTCCTGAAA TGATAAATCA GGGAACTAAC CAAACGGAGC 1740
AGAATGGTCA AGTGATGAAT ATTACTAATA GTGGTCATGA GAATAAAACA AAAGGTGATT 1800
CTATTCAGAA TGAGAAAAAT CCTAACCCAA TAGAATCACT CGAAAAAGAA TCTGCTTTCA 1860
AAACGAAAGC TGAACCTATA AGCAGCAGTA TAAGCAATAT GGAACTCGAA TTAAATATCC 1920
ACAATTCAAA AGCACCTAAA AAGAATAGGC TGAGGAGGAA GTCTTCTACC AGGCATATTC 1980
ATGCGCTTGA ACTAGTAGTC AGTAGAAATC TAAGCCCACC TAATTGTACT GAATTGCAAA 2040
TTGATAGTTG TTCTAGCAGT GAAGAGATAA AGAAAAAAAA GTACAACCAA ATGCCAGTCA 2100
GGCACAGCAG AAACCTACAA CTCATGGAAG GTAAAGAACC TGCAACTGGA GCCAAGAAGA 2160
GTAACAAGCC AAATGAACAG ACAAGTAAAA GACATGACAG TGATACTTTC CCAGAGCTGA 2220
AGTTAACAAA TGCACCTGGT TCTTTTACTA AGTGTTCAAA TACCAGTGAA CTTAAAGAAT 2280
TTGTCAATCC TAGCCTTCCA AGAGAAGAAA AAGAAGAGAA ACTAGAAACA GTTAAAGTGT 2340
CTAATAATGC TGAAGACCCC AAAGATCTCA TGTTAAGTGG AGAAAGGGTT TTGCAAACTG 2400
AAAGATCTGT AGAGAGTAGC AGTATTTCAC TGGTACCTGG TACTGATTAT GGCACTCAGG 2460

AAAGTATCTC GTTACTGGAA GTTAGCACTC TAGGGAAGGC AAAAACAGAA CCAAATAAAT 2520

GTGTGAGTCA GTGTGCAGCA TTTGAAAACC CCAAGGGACT AATTCATGGT TGTTCCAAAG 2580

ATAATAGAAA TGACACAGAA GGCTTTAAGT ATCCATTGGG ACATGAAGTT AACCACAGTC 2640

GGGAAACAAG CATAGAAATG GAAGAAAGTG AACTTGATGC TCAGTATTTG CAGAATACAT 2700

TCAAGGTTTC AAAGCGCCAG TCATTTGCTC TGTTTTCAAA TCCAGGAAAT GCAGAAGAGG 2760 

AATGTGCAAC ATTCTCTGCC CACTCTGGGT CCTTAAAGAA ACAAAGTCCA AAAGTCACTT 2820 

TTGAATGTGA ACAAAAGGAA GAAAATCAAG GAAAGAATGA GTCTAATATC AAGCCTGTAC 2880 

AGACAGTTAA TATCACTGCA GGCTTTCCTG TGGTTGGTCA GAAAGATAAG CCAGTTGATA 2940

ATGCCAAATG TAGTATCAAA GGAGGCTCTA GGTTTTGTCT ATCATCTCAG TTCAGAGGCA 3000

ACGAAACTGG ACTCATTACT CCAAATAAAC ATGGACTTTT ACAAAACCCA TATCGTATAC 3060

CACCACTTTT TCCCATCAAG TCATTTGTTA AAACTAAATG TAAGAAAAAT CTGCTAGAGG 3120 

AAAACTTTGA GGAACATTCA ATGTCACCTG AAAGAGAAAT GGGAAATGAG AACATTCCAA 3180 

GTACAGTGAG CACAATTAGC CGTAATAACA TTAGAGAAAA TGTTTTTAAA GGAGCCAGCT 3240

CAAGCAATAT TAATGAAGTA GGTTCCAGTA CTAATGAAGT GGGCTCCAGT ATTAATGAAA 3300

TAGGTTCCAG TGATGAAAAC ATTCAAGCAG AACTAGGTAG AAACAGAGGG CCAAAATTGA 3360

ATGCTATGCT TAGATTAGGG GTTTTGCAAC CTGAGGTCTA TAAACAAAGT CTTCCTGGAA 3420 

GTAATTGTAA GCATCCTGAA ATAAAAAAGC AAGAATATGA AGAAGTAGTT CAGACTGTTA 3480 
ATACAGATTT CTCTCCATAT CTGATTTCAG ATAACTTAGA ACAGCCTATG GGAAGTAGTC 3540
ATGCATCTCA GGTTTGTTCT GAGACACCTG ATGACCTGTT AGATGATGGT GAAATAAAGG 3600
AAGATACTAG TTTTGCTGAA AATGACATTA AGGAAAGTTC TGCTGTTTTT AGCAAAAGCG 3660

TCCAGAGAGG AGAGCTTAGC AGGAGTCCTA GCCCTTTCAC CCATACACAT TTGGCTCAGG 3720

GTTACCGAAG AGGGGCCAAG AAATTAGAGT CCTCAGAAGA GAACTTATCT AGTGAGGATG 3780

AAGAGCTTCC CTGCTTCCAA CACTTGTTAT TTGGTAAAGT AAACAATATA CCTTCTCAGT 3840 

CTACTAGGCA TAGCACCGTT GCTACCGAGT GTCTGTCTAA GAACACAGAG GAGAATTTAT 3900 

TATCATTGAA GAATAGCTTA AATGACTGCA GTAACCAGGT AATATTGGCA AAGGCATCTC 3960

AGGAACATCA CCTTAGTGAG GAAACAAAAT GTTCTGCTAG CTTGTTTTCT TCACAGTGCA 4020 

GTGAATTGGA AGACTTGACT GCAAATACAA ACACCCAGGA TCCTTTCTTG ATTGGTTCTT 4080

CCAAACAAAT GAGGCATCAG TCTGAAAGCC AGGGAGTTGG TCTGAGTGAC AAGGAATTGG 4140 

TTTCAGATGA TGAAGAAAGA GGAACGGGCT TGGAAGAAAA TAATCAAGAA GAGCAAAGCA 4200

TGGATTCAAA CTTAGGTGAA GCAGCATCTG GGTGTGAGAG TGAAACAAGC GTCTCTGAAG 4260

ACTGCTCAGG GCTATCCTCT CAGAGTGACA TTTTAACCAC TCAGCAGAGG GATACCATGC 4320 

AACATAACCT GATAAAGCTC CAGCAGGAAA TGGCTGAACT AGAAGCTGTG TTAGAACAGC 4380

ATGGGAGCCA GCCTTCTAAC AGCTACCCTT CCATCATAAG TGACTCTTCT GCCCTTGAGG 4440 
ACCTGCGAAA TCCAGAACAA AGCACATCAG AAAAAGCAGT ATTAACTTCA CAGAAAAGTA 4500

GTGAATACCC TATAAGCCAG AATCCAGAAG GCCTTTCTGC TGACAAGTTT GAGGTGTCTG 4560 

CAGATAGTTC TACCAGTAAA AATAAAGAAC CAGGAGTGGA AAGGTCATCC CCTTCTAAAT 4620 

GCCCATCATT AGATGATAGG TGGTACATGC ACAGTTGCTC TGGGAGTCTT CAGAATAGAA 4680 

ACTACCCATC TCAAGAGGAG CTCATTAAGG TTGTTGATGT GGAGGAGCAA CAGCTGGAAG 4740

AGTCTGGGCC ACACGATTTG ACGGAAACAT CTTACTTGCC AAGGCAAGAT CTAGAGGGAA 4800
CCCCTTACCT GGAATCTGGA ATCAGCCTCT TCTCTGATGA CCCTGAATCT GATCCTTCTG 4860
AAGACAGAGC CCCAGAGTCA GCTCGTGTTG GCAACATACC ATCTTCAACC TCTGCATTGA 4920
AAGTTCCCCA ATTGAAAGTT GCAGAATCTG CCCAGGGTCC AGCTGCTGCT CATACTACTG 4980
ATACTGCTGG GTATAATGCA ATGGAAGAAA GTGTGAGCAG GGAGAAGCCA GAATTGACAG 5040
CTTCAACAGA AAGGGTCAAC AAAAGAATGT CCATGGTGGT GTCTGGCCTG ACCCCAGAAG 5100
AATTTATGCT CGTGTACAAG TTTGCCAGAA AACACCACAT CACTTTAACT AATCTAATTA 5160
CTGAAGAGAC TACTCATGTT GTTATGAAAA CAGATGCTGA GTTTGTGTGT GAACGGACAC 5220
TGAAATATTT TCTAGGAATT GCGGGAGGAA AATGGGTAGT TAGCTATTTC TGGGTGACCC 5280
AGTCTATTAA AGAAAGAAAA ATGCTGAATG AGCATGATTT TGAAGTCAGA GGAGATGTGG 5340
TCAATGGAAG AAACCACCAA GGTCCAAAGC GAGCAAGAGA ATCCCAGGAC AGAAAGATCT 5400
TCAGGGGGCT AGAAATCTGT TGCTATGGGC CCTTCACCAA CATGCCCACA GATCAACTGG 5460
AATGGATGGT ACAGCTGTGT GGTGCTTCTG TGGTGAAGGA GCTTTCATCA TTCACCCTTG 5520
GCACAGGTGT CCACCCAATT GTGGTTGTGC AGCCAGATGC CTGGACAGAG GACAATGGCT 5580
TCCATGCAAT TGGGCAGATG TGTGAGGCAC CTGTGGTGAC CCGAGAGTGG GTGTTGGACA 5640
GTGTAGCACT CTACCAGTGC CAGGAGCTGG ACACCTACCT GATACCCCAG ATCCCCCACA 5700
GCCACTACTG A 5711

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1863 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(B) STRAIN: BRCA1 

(viii) POSITION IN GENOME: 

(A) CHROMOSOME/ SEGMENT: 17 

(B) MAP POSITION: 17q21 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Met Asp Leu Ser Ala Leu Arg Val Glu Glu Val Gln Asn Val Ile Asn
1 5 10 15

Ala Met Gln Lys Ile Leu Glu Cys Pro Ile Cys Leu Glu Leu Ile Lys
20 25 30

Glu Pro Val Ser Thr Lys Cys Asp His Ile Phe Cys Lys Phe Cys Met
35 40 45

Leu Lys Leu Leu Asn Gln Lys Lys Gly Pro Ser Gln Cys Pro Leu Cys
50 55 60

Lys Asn Asp Ile Thr Lys Arg Ser Leu Gln Glu Ser Thr Arg Phe Ser
65 70 75 80

Gln Leu Val Glu Glu Leu Leu Lys Ile Ile Cys Ala Phe Gln Leu Asp
85 90 95

Thr Gly Leu Glu Tyr Ala Asn Ser Tyr Asn Phe Ala Lys Lys Glu Asn
100 105 110

Asn Ser Pro Glu His Leu Lys Asp Glu Val Ser Ile Ile Gln Ser Met
115 120 125

Gly Tyr Arg Asn Arg Ala Lys Arg Leu Leu Gln Ser Glu Pro Glu Asn
130 135 140

Pro Ser Leu Gln Glu Thr Ser Leu Ser Val Gln Leu Ser Asn Leu Gly
145 150 155 160

Thr Val Arg Thr Leu Arg Thr Lys Gln Arg Ile Gln Pro Gln Lys Thr
165 170 175

Ser Val Tyr Ile Glu Leu Gly Ser Asp Ser Ser Glu Asp Thr Val Asn
180 185 190

Lys Ala Thr Tyr Cys Ser Val Gly Asp Gln Glu Leu Leu Gln Ile Thr
195 200 205

Pro Gln Gly Thr Arg Asp Glu Ile Ser Leu Asp Ser Ala Lys Lys Ala
210 215 220

Ala Cys Glu Phe Ser Glu Thr Asp Val Thr Asn Thr Glu His His Gln
225 230 235 240

Pro Ser Asn Asn Asp Leu Asn Thr Thr Glu Lys Arg Ala Ala Glu Arg
245 250 255

His Pro Glu Lys Tyr Gln Gly Ser Ser Val Ser Asn Leu His Val Glu
260 265 270

Pro Cys Gly Thr Asn Thr His Ala Ser Ser Leu Gln His Glu Asn Ser
275 280 285
Ser Leu Leu Leu Thr Lys Asp Arg Met Asn Val Glu Lys Ala Glu Phe
290 295 300

Cys Asn Lys Ser Lys Gln Pro Gly Leu Ala Arg Ser Gln His Asn Arg
305 310 315 320

Trp Ala Gly Ser Lys Glu Thr Cys Asn Asp Arg Arg Thr Pro Ser Thr
325 330 335

Glu Lys Lys Val Asp Leu Asn Ala Asp Pro Leu Cys Glu Arg Lys Glu
340 345 350

Trp Asn Lys Gln Lys Leu Pro Cys Ser Glu Asn Pro Arg Asp Thr Glu
355 360 365

Asp Val Pro Trp Ile Thr Leu Asn Ser Ser Ile Gln Lys Val Asn Glu
370 375 380

Trp Phe Ser Arg Ser Asp Glu Leu Leu Gly Ser Asp Asp Ser His Asp
385 390 395 400

Gly Glu Ser Glu Ser Asn Ala Lys Val Ala Asp Val Leu Asp Val Leu
405 410 415

Asn Glu Val Asp Glu Tyr Ser Gly Ser Ser Glu Lys Ile Asp Leu Leu
420 425 430

Ala Ser Asp Pro His Glu Ala Leu Ile Cys Lys Ser Glu Arg Val His
435 440 445

Ser Lys Ser Val Glu Ser Asn Ile Glu Asp Lys Ile Phe Gly Lys Thr
450 455 460

Tyr Arg Lys Lys Ala Ser Leu Pro Asn Leu Ser His Val Thr Glu Asn
465 470 475 480

Leu Ile Ile Gly Ala Phe Val Thr Glu Pro Gln Ile Ile Gln Glu Arg
485 490 495

Pro Leu Thr Asn Lys Leu Lys Arg Lys Arg Arg Pro Thr Ser Gly Leu
500 505 510

His Pro Glu Asp Phe Ile Lys Lys Ala Asp Leu Ala Val Gln Lys Thr
515 520 525

Pro Glu Met Ile Asn Gln Gly Thr Asn Gln Thr Glu Gln Asn Gly Gln
530 535 540

Val Met Asn Ile Thr Asn Ser Gly His Glu Asn Lys Thr Lys Gly Asp
545 550 555 560

Ser Ile Gln Asn Glu Lys Asn Pro Asn Pro Ile Glu Ser Leu Glu Lys
565 570 575

Glu Ser Ala Phe Lys Thr Lys Ala Glu Pro Ile Ser Ser Ser Ile Ser
580 585 590

Asn Met Glu Leu Glu Leu Asn Ile His Asn Ser Lys Ala Pro Lys Lys
595 600 605

Asn Arg Leu Arg Arg Lys Ser Ser Thr Arg His Ile His Ala Leu Glu
610 615 620

Leu Val Val Ser Arg Asn Leu Ser Pro Pro Asn Cys Thr Glu Leu Gln
625 630 635 640

Ile Asp Ser Cys Ser Ser Ser Glu Glu Ile Lys Lys Lys Lys Tyr Asn
645 650 655

Gln Met Pro Val Arg His Ser Arg Asn Leu Gln Leu Met Glu Gly Lys
660 665 670

Glu Pro Ala Thr Gly Ala Lys Lys Ser Asn Lys Pro Asn Glu Gln Thr
675 680 685

Ser Lys Arg His Asp Ser Asp Thr Phe Pro Glu Leu Lys Leu Thr Asn
690 695 700

Ala Pro Gly Ser Phe Thr Lys Cys Ser Asn Thr Ser Glu Leu Lys Glu
705 710 715 720

Phe Val Asn Pro Ser Leu Pro Arg Glu Glu Lys Glu Glu Lys Leu Glu
725 730 735

Thr Val Lys Val Ser Asn Asn Ala Glu Asp Pro Lys Asp Leu Met Leu
740 745 750

Ser Gly Glu Arg Val Leu Gln Thr Glu Arg Ser Val Glu Ser Ser Ser
755 760 765

Ile Ser Leu Val Pro Gly Thr Asp Tyr Gly Thr Gln Glu Ser Ile Ser
770 775 780

Leu Leu Glu Val Ser Thr Leu Gly Lys Ala Lys Thr Glu Pro Asn Lys
785 790 795 800

Cys Val Ser Gln Cys Ala Ala Phe Glu Asn Pro Lys Gly Leu Ile His
805 810 815

Gly Cys Ser Lys Asp Asn Arg Asn Asp Thr Glu Gly Phe Lys Tyr Pro
820 825 830

Leu Gly His Glu Val Asn His Ser Arg Glu Thr Ser Ile Glu Met Glu
835 840 845

Glu Ser Glu Leu Asp Ala Gln Tyr Leu Gln Asn Thr Phe Lys Val Ser
850 855 860

Lys Arg Gln Ser Phe Ala Leu Phe Ser Asn Pro Gly Asn Ala Glu Glu
865 870 875 880

Glu Cys Ala Thr Phe Ser Ala His Ser Gly Ser Leu Lys Lys Gln Ser
885 890 895

Pro Lys Val Thr Phe Glu Cys Glu Gln Lys Glu Glu Asn Gln Gly Lys
900 905 910

Asn Glu Ser Asn Ile Lys Pro Val Gln Thr Val Asn Ile Thr Ala Gly
915 920 925

Phe Pro Val Val Gly Gln Lys Asp Lys Pro Val Asp Asn Ala Lys Cys
930 935 940

Ser Ile Lys Gly Gly Ser Arg Phe Cys Leu Ser Ser Gln Phe Arg Gly
945 950 955 960

Asn Glu Thr Gly Leu Ile Thr Pro Asn Lys His Gly Leu Leu Gln Asn
965 970 975

Pro Tyr Arg Ile Pro Pro Leu Phe Pro Ile Lys Ser Phe Val Lys Thr
980 985 990

Lys Cys Lys Lys Asn Leu Leu Glu Glu Asn Phe Glu Glu His Ser Met
995 1000 1005

Ser Pro Glu Arg Glu Met Gly Asn Glu Asn Ile Pro Ser Thr Val Ser
1010 1015 1020

Thr Ile Ser Arg Asn Asn Ile Arg Glu Asn Val Phe Lys Gly Ala Ser
1025 1030 1035 1040

Ser Ser Asn Ile Asn Glu Val Gly Ser Ser Thr Asn Glu Val Gly Ser
1045 1050 1055

Ser Ile Asn Glu Ile Gly Ser Ser Asp Glu Asn Ile Gln Ala Glu Leu
1060 1065 1070

Gly Arg Asn Arg Gly Pro Lys Leu Asn Ala Met Leu Arg Leu Gly Val
1075 1080 1085

Leu Gln Pro Glu Val Tyr Lys Gln Ser Leu Pro Gly Ser Asn Cys Lys
1090 1095 1100

His Pro Glu Ile Lys Lys Gln Glu Tyr Glu Glu Val Val Gln Thr Val
1105 1110 1115 1120

Asn Thr Asp Phe Ser Pro Tyr Leu Ile Ser Asp Asn Leu Glu Gln Pro
1125 1130 1135

Met Gly Ser Ser His Ala Ser Gln Val Cys Ser Glu Thr Pro Asp Asp
1140 1145 1150

Leu Leu Asp Asp Gly Glu Ile Lys Glu Asp Thr Ser Phe Ala Glu Asn
1155 1160 1165

Asp Ile Lys Glu Ser Ser Ala Val Phe Ser Lys Ser Val Gln Arg Gly
1170 1175 1180

Glu Leu Ser Arg Ser Pro Ser Pro Phe Thr His Thr His Leu Ala Gln
1185 1190 1195 1200

Gly Tyr Arg Arg Gly Ala Lys Lys Leu Glu Ser Ser Glu Glu Asn Leu
1205 1210 1215

Ser Ser Glu Asp Glu Glu Leu Pro Cys Phe Gln His Leu Leu Phe Gly
1220 1225 1230



Lys Val Asn Asn Ile Pro Ser Gln Ser Thr Arg His Ser Thr Val Ala
1235 1240 1245

Thr Glu Cys Leu Ser Lys Asn Thr Glu Glu Asn Leu Leu Ser Leu Lys
1250 1255 1260

Asn Ser Leu Asn Asp Cys Ser Asn Gln Val Ile Leu Ala Lys Ala Ser
1265 1270 1275 1280

Gln Glu His His Leu Ser Glu Glu Thr Lys Cys Ser Ala Ser Leu Phe
1285 1290 1295

Ser Ser Gln Cys Ser Glu Leu Glu Asp Leu Thr Ala Asn Thr Asn Thr
1300 1305 1310

Gln Asp Pro Phe Leu Ile Gly Ser Ser Lys Gln Met Arg His Gln Ser
1315 1320 1325

Glu Ser Gln Gly Val Gly Leu Ser Asp Lys Glu Leu Val Ser Asp Asp
1330 1335 1340

Glu Glu Arg Gly Thr Gly Leu Glu Glu Asn Asn Gln Glu Glu Gln Ser
1345 1350 1355 1360

Met Asp Ser Asn Leu Gly Glu Ala Ala Ser Gly Cys Glu Ser Glu Thr
1365 1370 1375

Ser Val Ser Glu Asp Cys Ser Gly Leu Ser Ser Gln Ser Asp Ile Leu
1380 1385 1390

Thr Thr Gln Gln Arg Asp Thr Met Gln His Asn Leu Ile Lys Leu Gln
1395 1400 1405

Gln Glu Met Ala Glu Leu Glu Ala Val Leu Glu Gln His Gly Ser Gln
1410 1415 1420

Pro Ser Asn Ser Tyr Pro Ser Ile Ile Ser Asp Ser Ser Ala Leu Glu
1425 1430 1435 1440

Asp Leu Arg Asn Pro Glu Gln Ser Thr Ser Glu Lys Ala Val Leu Thr
1445 1450 1455

Ser Gln Lys Ser Ser Glu Tyr Pro Ile Ser Gln Asn Pro Glu Gly Leu
1460 1465 1470

Ser Ala Asp Lys Phe Glu Val Ser Ala Asp Ser Ser Thr Ser Lys Asn
1475 1480 1485

Lys Glu Pro Gly Val Glu Arg Ser Ser Pro Ser Lys Cys Pro Ser Leu
1490 1495 1500

Asp Asp Arg Trp Tyr Met His Ser Cys Ser Gly Ser Leu Gln Asn Arg
1505 1510 1515 1520

Asn Tyr Pro Ser Gln Glu Glu Leu Ile Lys Val Val Asp Val Glu Glu
1525 1530 1535

Gln Gln Leu Glu Glu Ser Gly Pro His Asp Leu Thr Glu Thr Ser Tyr
1540 1545 1550

Leu Pro Arg Gln Asp Leu Glu Gly Thr Pro Tyr Leu Glu Ser Gly Ile
1555 1560 1565

Ser Leu Phe Ser Asp Asp Pro Glu Ser Asp Pro Ser Glu Asp Arg Ala
1570 1575 1580

Pro Glu Ser Ala Arg Val Gly Asn Ile Pro Ser Ser Thr Ser Ala Leu
1585 1590 1595 1600

Lys Val Pro Gln Leu Lys Val Ala Glu Ser Ala Gln Gly Pro Ala Ala
1605 1610 1615

Ala His Thr Thr Asp Thr Ala Gly Tyr Asn Ala Met Glu Glu Ser Val
1620 1625 1630

Ser Arg Glu Lys Pro Glu Leu Thr Ala Ser Thr Glu Arg Val Asn Lys
1635 1640 1645

Arg Met Ser Met Val Val Ser Gly Leu Thr Pro Glu Glu Phe Met Leu
1650 1655 1660



Val Tyr Lys Phe Ala Arg Lys His His Ile Thr Leu Thr Asn Leu Ile
1665 1670 1675 1680

Thr Glu Glu Thr Thr His Val Val Met Lys Thr Asp Ala Glu Phe Val
1685 1690 1695

Cys Glu Arg Thr Leu Lys Tyr Phe Leu Gly Ile Ala Gly Gly Lys Trp
1700 1705 1710

Val Val Ser Tyr Phe Trp Val Thr Gln Ser Ile Lys Glu Arg Lys Met
1715 1720 1725

Leu Asn Glu His Asp Phe Glu Val Arg Gly Asp Val Val Asn Gly Arg
1730 1735 1740

Asn His Gln Gly Pro Lys Arg Ala Arg Glu Ser Gln Asp Arg Lys Ile
1745 1750 1755 1760

Phe Arg Gly Leu Glu Ile Cys Cys Tyr Gly Pro Phe Thr Asn Met Pro
1765 1770 1775

Thr Asp Gln Leu Glu Trp Met Val Gln Leu Cys Gly Ala Ser Val Val
1780 1785 1790

Lys Glu Leu Ser Ser Phe Thr Leu Gly Thr Gly Val His Pro Ile Val
1795 1800 1805

Val Val Gln Pro Asp Ala Trp Thr Glu Asp Asn Gly Phe His Ala Ile
1810 1815 1820

Gly Gln Met Cys Glu Ala Pro Val Val Thr Arg Glu Trp Val Leu Asp
1825 1830 1835 1840

Ser Val Ala Leu Tyr Gln Cys Gln Glu Leu Asp Thr Tyr Leu Ile Pro
1845 1850 1855

Gln Ile Pro His Ser His Tyr
1860

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(B) STRAIN: 2F primer



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7: 
GAAGTTGTCA TTTTATAAAC CTTT 24
(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(B) STRAIN: 2R primer 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
TGTCTTTTCT TCCCTAGTAT GT 22
(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(B) STRAIN: 3F primer 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

TCCTGACACA GCAGACATTT A 21
(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(B) STRAIN: 3R primer 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
TTGGATTTTC GTTCTCACTT A 21
(2) INFORMATION FOR SEQ ID NO: 11: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(B) STRAIN: 5F primer 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
CTCTTAAGGG CAGTTGTGAG 20 

(2) INFORMATION FOR SEQ ID NO:12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE: 

(B) STRAIN: 5R-M13* primer 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
TTCCTACTGT GGTTGCTTCC 20

(2) INFORMATION FOR SEQ ID NO: 13: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(B) STRAIN: 6/7F primer



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 
CTTATTTTAG TGTCCTTAAA AGG 23

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 
(B) STRAIN: 6R 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
TTTCATGGAC AGCACTTGAG TG 22

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(B) STRAIN: 7F primer 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 
CACAACAAAG AGCATACATA GGG 23
(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(B) STRAIN: 6/7R primer 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

TCGGGTTCAC TCTGTAGAAG 20
(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)



(vi) ORIGINAL SOURCE: 

(B) STRAIN: 8F1 primer 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

TTCTCTTCAG GAGGAAAAGC A 21 
(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE:

(B) STRAIN: 8R1 primer 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 
GCTGCCTACC ACAAATACAA A 21 
(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:

(B) STRAIN: 9F primer



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 
CCACAGTAGA TGCTCAGTAA ATA 23

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(B) STRAIN: 9R primer 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 
TAGGAAAATA CCAGCTTCAT AGA 23

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:

(B) STRAIN: 10F primer

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:
TGGTCAGCTT TCTGTAATCG 20
(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:

(B) STRAIN: 10R primer

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:
GTATCTACCC ACTCTCTTCT TCAG 24

(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:

(B) STRAIN: 11AF primer

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:
CCACCTCCAA GGTGTATCA 19



(2) INFORMATION FOR SEQ ID NO: 24: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(B) STRAIN: 11AR primer 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 
TGTTATGTTG GCTCCTTGCT 20
(2) INFORMATION FOR SEQ ID NO: 25:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:

(B) STRAIN: 11BF1 primer

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:
CACTAAAGAC AGAATGAATC TA 22
(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(B) STRAIN: 11BR1 primer 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

GAAGAACCAG AATATTCATC TA 22

(2) INFORMATION FOR SEQ ID NO: 27: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(B) STRAIN: 11CF1 primer 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:
TGATGGGGAG TCTGAATCAA 20



(2) INFORMATION FOR SEQ ID NO: 28: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:

(B) STRAIN: 11CR1 primer 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 
TCTGCTTTCT TGATAAAATC CT 22 
(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(B) STRAIN: 11DF1 primer 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29: 
AGCGTCCCCT CACAAATAAA 20
(2) INFORMATION FOR SEQ ID NO: 30:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE:

(B) STRAIN: 11DR1 primer

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:
TCAAGCGCAT GAATATGCCT 20 
(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(B) STRAIN: 11EF primer 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 
GTATAAGCAA TATGGAACTC GA 22 
(2) INFORMATION FOR SEQ ID NO: 32: 
(i) SEQUENCE CHARACTERISTICS: 
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(A) LENGTH: 23 base pairs ' 

(B) TYPE : nucleic acid 

(C) STRANDEDNESS: not relevant - 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL. SOURCE: 

(B) STRAIN: HER primer. 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 

TTAAGTTCAC TGGTATTTGA ACA

(2) INFORMATION FOR SEQ ID NO: 33: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(B) STRAIN: 11FF primer 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 

GACAGCGATA CTTTCCCAGA 20 

(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 21 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(B) STRAIN: 11FR primer



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:
TGGAACAACC ATGAATTAGT C 
(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE:

(B) STRAIN: 11GF primer 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:
GGAAGTTAGC ACTCTAGGGA 

(2) INFORMATION FOR SEQ ID NO: 36: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE: 

(B) STRAIN: 11GR primer 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:
GCAGTGATAT TAACTGTCTG TA 
(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(B) STRAIN: 11HF primer 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:
TGGGTCCTTA AAGAAACAAA GT 
(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE: 

(B) STRAIN: 11HR primer 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:
TCAGGTGACA TTGAATCTTC C 21

(2) INFORMATION FOR SEQ ID NO: 39: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(B) STRAIN: 11IF primer 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 
CCACTTTTTC CCATCAAGTC A 21

(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(vi) ORIGINAL SOURCE: 
(B) STRAIN: 11IR primer



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:40: 
TCAGGATGCT TACAATTACT TC 22

(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE: 

(B) STRAIN: 11JF primer 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 
CAAAATTGAA TGCTATGCTT AGA 

(2) INFORMATION FOR SEQ ID NO: 42: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(B) STRAIN: 11JR primer 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:
TCGGTAACCC TGAGCCAAAT 20

(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(B) STRAIN: 11KF primer 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:43: 
GCAAAAGCGT CCAGAAAGGA 20

(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(B) STRAIN: 11KR-1 primer 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 
TATTTGCAGT CAAGTCTTCC AA

(2) INFORMATION FOR SEQ ID NO: 45: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(B) STRAIN: 11LF-1 primer 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 
GTAATATTGG CAAAGGCATC T 
(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(B) STRAIN: 11LR primer 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46: 
TAAAATGTGC TCCCCAAAAG CA 
(2) INFORMATION FOR SEQ ID NO: 47:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(B) STRAIN: 12F primer



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 
GTCCTGCCAA TGAGAAGAAA 20

(2) INFORMATION FOR SEQ ID NO: 48: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(B) STRAIN: 12R primer 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:
TGTCAGCAAA CCTAAGAATG T 
(2) INFORMATION FOR SEQ ID NO: 49: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(B) STRAIN: 13F primer 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 
AATGGAAAGC TTCTCAAAGT A 
(2) INFORMATION FOR SEQ ID NO: 50:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(B) STRAIN: 13R primer



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 

ATGTTGGAGC TAGGTCCTTA C 

(2) INFORMATION FOR SEQ ID NO: 51: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE: 

(B) STRAIN: 14F primer 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:
CTAACCTGAA TTATCACTAT CA 22 



(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(vi) ORIGINAL SOURCE: 
(B) STRAIN: 14R primer 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 
GTGTATAAAT GCCTGTATGC A 21 
(2) INFORMATION FOR SEQ ID NO: 53:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:

(B) STRAIN: 15F primer



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:
TGGCTGCCCA GGAAGTATG 19
(2) INFORMATION FOR SEQ ID NO: 54: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(B) STRAIN: 15R primer 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:
AACCAGAATA TCTTTATGTA GGA 23
(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE: 

(B) STRAIN: 16F primer 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:
AATTCTTAAC AGAGACCAGA AC 
(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(B) STRAIN: 16R primer 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56:

AAAACTCTTT CCAGAATGTT GT 

(2) INFORMATION FOR SEQ ID NO: 57:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(B) STRAIN: 17F primer

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:
GTGTAGAACG TGCAGGATTG 
(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(B) STRAIN: 17R primer 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:58: 

TCGCCTCATG TGGTTTTA 18 
(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(B) STRAIN: 18F primer 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:
GGCTCTTTAG CTTCTTAGGA C 21
(2) INFORMATION FOR SEQ ID NO: 60:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:

(B) STRAIN: 18R primer



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60:
GAGACCATTT TCCCAGCATC
(2) INFORMATION FOR SEQ ID NO: 61:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:

(B) STRAIN: 19F primer



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:
CTGTCATTCT TCCTGTGCTC
(2) INFORMATION FOR SEQ ID NO: 64:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant 



(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(B) STRAIN: 20R primer 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64: 
GGGAATCCAA ATTACACAGC 
(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(B) STRAIN: 21F primer



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65: 

AAGCTCTTCC TTTTTGAAAG TC 

(2) INFORMATION FOR SEQ ID NO: 66: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(B) STRAIN: 21R primer 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: 
GTAGAGAAAT AGAATAGCCT CT 
(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(B) STRAIN: 22F primer 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67: 
TCCCATTGAG AGGTCTTGCT 
(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE: 

(B) STRAIN: 22R primer 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68: 

GAGAAGACTT CTGAGGCTAC 

(2) INFORMATION FOR SEQ ID NO: 69: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(B) STRAIN: 23F-1 primer



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69: 
TGAAGTGACA GTTCCAGTAG T 
(2) INFORMATION FOR SEQ ID NO: 70:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE:

(B) STRAIN: 23R-1 primer 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:
CATTTTAGCC ATTCATTCAA CAA
(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE:

(B) STRAIN: 24F primer 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71: 

ATGAATTGAC ACTAATCTCT GC 22 

(2) INFORMATION FOR SEQ ID NO: 72:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(B) STRAIN: 24R primer 